Senior Site Reliability Engineer
Job Title: Senior Site Reliability Engineer / DevOps Engineer
Location: Bothell, WA
Duration: Contract
Term: 6 months
Job Description:
Experience Desired: 7 years
Key Responsibilities
Platform Reliability & Operations
- Own reliability, availability, scalability, and performance of API Gateway services running on Kubernetes
- Design and implement SRE best practices including SLIs, SLOs, SLAs, error budgets, and incident management
- Lead production readiness reviews, root cause analysis (RCA), and post-incident improvements
- Drive capacity planning, performance tuning, and resilience testing
Kubernetes & Cloud Engineering
- Manage and optimize Kubernetes clusters (EKS / AKS / GKE / On-prem)
- Develop and maintain Helm charts, manifests, and deployment strategies
- Implement rollout strategies such as blue-green, canary, and rolling deployments
- Collaborate with development teams to ensure cloud-native design patterns
Observability & Monitoring (Strong Focus)
- Build and maintain enterprise-grade observability (O11y) solutions:
  - Prometheus & Grafana for metrics and dashboards
  - Splunk for centralized logging and alerting
  - OpenTelemetry for distributed tracing
- Define actionable alerts and dashboards for platform and application health
- Improve MTTR through better visibility and automation
CI/CD & Automation
- Design and maintain CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, etc.)
- Automate infrastructure using Infrastructure as Code (Terraform, CloudFormation, etc.)
- Develop automation scripts using Python, Bash, or Groovy
Security & Compliance
- Implement DevSecOps practices including secrets management, image scanning, and RBAC
- Work closely with security teams on vulnerability remediation and compliance controls
Innovation & POCs
- Actively contribute to POCs for AI Gateway / Intelligent API Gateway initiatives
- Evaluate and prototype integrations with AI/ML-driven routing, observability, and security features
- Stay current with emerging SRE, cloud, and AI gateway technologies
Soft Skills
- Strong troubleshooting and problem-solving skills
- Ability to work cross-functionally with developers, architects, and security teams
- Proactive mindset with a passion for automation and reliability
- Good documentation and communication skills
Key Skills:
SRE, DevOps, Java, Kubernetes, Observability