Site Reliability Engineer

Apolis
  • Engineering
  • Full-time
  • Applications have closed

Role: Site Reliability Engineer

Duration: 12+ Months

Location: Raleigh, NC (Onsite, 5 days a week)

Primary Skill Required for the Role: Azure Infrastructure

Level Required for Primary Skill: Intermediate (3-5 years of experience)

  • Proven expertise in Site Reliability Engineering, with a background in software engineering, infrastructure, or operations.
  • Hands-on experience with cloud platforms (e.g., Azure), operating systems (e.g., Linux RHEL 7+, Windows Server 2019+), and networking fundamentals.
  • Solid understanding of networking and storage technologies (e.g., NFS, SAN, NAS).
  • Strong working knowledge of authentication and naming services (e.g., DNS, LDAP, Kerberos, Centrify).
  • Proficiency in scripting and automation (e.g., Python, Go, Bash).
  • Practical experience with infrastructure as code tools (e.g., Terraform, Ansible).
  • Demonstrated ability to define and manage SLIs, SLOs, and SLAs, and to systematically reduce toil.
  • Ability to integrate systems and services with observability platforms to ensure system visibility.
  • A metrics- and automation-driven mindset, with a strong focus on measurable reliability.
  • Calm under pressure, especially during incidents and outages, with a structured approach to incident response and post-mortems.
  • Strong collaboration and communication skills, with the ability to work across engineering and business teams.
  • A proactive, ownership-driven attitude, always seeking opportunities to improve systems and processes.