Site Reliability Engineer
Job Title: Site Reliability Engineer
Location: San Jose, CA
Duration / Term: 6+ months
Experience Desired: 14+ Years
Job Description:
We seek a highly skilled and dynamic Site Reliability Engineer Consultant. In this role you will:
- Maintain and improve the reliability, performance, and availability of software systems.
- Act as a bridge between traditional IT operations and software development, bringing a software engineering approach to system administration.
Job Responsibilities
- Creating and supporting automation scripts (shell, Ansible, Python) for infrastructure deployments, validations, and monitoring to streamline operational tasks
- Scheduling monitoring scripts using cron and Airflow
- Monitoring using tools such as Dynatrace, Apica, and Grafana
- Database handling
- Building CI/CD pipelines
- Incident handling and problem management
Required Experience
- 14+ years of IT infrastructure experience
- Extensive experience working with Linux distributions such as RHEL and CentOS, including shells, filesystems, and utilities
- Experience with programming and automation tools such as Python and Ansible
- Knowledge of distributed computing and experience working with container orchestration frameworks, including on-prem and Rancher Kubernetes, plus good knowledge of Kubernetes objects
- Storage experience preferred: volumes, aggregates, backups, and DR planning
- Experience scheduling monitoring scripts using cron and Airflow
- Experience with monitoring tools such as Dynatrace, Apica, and Grafana
- Database knowledge, including SQL and NoSQL databases
- Experience building CI/CD pipelines (preferred)
- Cloud platform knowledge (specifically AWS) is required
Key Skills:
SRE, Angular, Cloud Technology, Deployment, IaC, Scripting, Terraform, Kubernetes, Ansible, Python