SRE
Job Title: Site Reliability Engineer (SRE)
Location: Alpharetta, GA
Duration: Contract
Term: 6+ months
Job Description:
Experience Desired: 10+ years
Key Responsibilities:
- Design, implement, and manage scalable infrastructure on Google Cloud Platform.
- Develop and maintain infrastructure as code using Terraform.
- Operate and manage containerized applications in Kubernetes (GKE or self-managed).
- Build and maintain monitoring, logging, and alerting systems to ensure high availability and performance.
- Collaborate with engineering teams to optimize system reliability, scalability, and performance.
- Conduct incident response, postmortems, and drive continuous improvements.
- Implement CI/CD pipelines and advocate for DevOps best practices.
- Enforce security and compliance best practices across all environments.
Required Skills and Qualifications:
- Proven experience in Site Reliability Engineering or a similar role.
- Strong hands-on experience with Google Cloud Platform (GCP) services.
- Proficiency in Terraform and infrastructure-as-code principles.
- Solid experience managing and deploying applications using Kubernetes.
- Familiarity with CI/CD pipelines and tools like Jenkins, GitLab CI, or GitHub Actions.
- Experience with observability tools (e.g., Prometheus, Grafana, Stackdriver, Datadog).
- Good understanding of networking, Linux systems, and cloud security practices.
- Ability to write automation scripts in languages such as Bash, Python, or Go.
Key Skills:
SRE, Google Cloud Platform, Python, Bash, Kubernetes, Terraform