Site Reliability Engineer

Job Title: Site Reliability Engineer (SRE)

Location: Columbia, MD or Chicago, IL (Hybrid Preferred – 4 days onsite, with flexibility)

Type: Contract-to-Hire (6-12 months with potential for conversion)

About the Role
Join our dynamic Platform Engineering team as a Site Reliability Engineer (SRE). You'll be responsible for ensuring the reliability, scalability, and performance of our systems, working in a fast-paced and collaborative environment. The role is open to both senior engineers (5+ years of experience) and junior engineers (3+ years of experience) looking to grow their skill set.

Key Responsibilities

Design, build, and maintain scalable, reliable infrastructure and services.
Implement monitoring, alerting, and incident response systems to ensure high availability.
Automate repetitive tasks to reduce manual toil and improve system efficiency.
Collaborate with development and DevOps teams to enhance application reliability.
Conduct root cause analysis and post-mortems for production incidents.
Define and track SLOs, SLIs, and error budgets to measure system health.
Participate in on-call rotations and handle incidents promptly.
Continuously enhance system performance, reliability, and cost-efficiency.
Maintain code quality using SonarQube.
Support CI/CD pipelines, with a focus on Harness (training provided).

Required Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
3+ years (junior) / 5+ years (senior) experience in SRE, DevOps, or Systems Engineering.
Strong expertise in SonarQube.
Proficiency in at least one programming language (Python, Go, Java, etc.).
Hands-on experience with cloud platforms (AWS, Google Cloud Platform, or Azure).
Solid Linux systems and networking knowledge.
Experience with containerization (Docker, Kubernetes).
Familiarity with CI/CD tools and infrastructure-as-code (Terraform, Ansible).
Experience with monitoring tools (Prometheus, Grafana, Datadog).
