Site Reliability Engineer

Job Title: Site Reliability Engineer

Location: San Jose, CA

Duration / Term: 6+ months

Experience Desired: 14+ Years

Job Description:

We seek a highly skilled and dynamic Site Reliability Engineer Consultant. In this role, you will:
Maintain and improve the reliability, performance, and availability of software systems.
Act as a bridge between traditional IT operations and software development, bringing a software engineering approach to system administration.

Job Responsibilities

Creating and supporting automation scripts (shell/Ansible/Python) for infrastructure deployments, validations, and monitoring to streamline operational tasks
Scheduling monitoring scripts using cron and Airflow
Monitoring using tools including Dynatrace, Apica, Grafana, etc.
Database handling
Building CI/CD pipelines
Incident handling and problem management

Required Experience

14+ years of IT infrastructure experience
Extensive experience working with Linux distributions such as RHEL/CentOS, as well as shells, filesystems, and utilities
Experience with programming and automation tools such as Python and Ansible
Knowledge of distributed computing and experience working with container orchestration frameworks, including on-prem and Rancher Kubernetes, plus good knowledge of Kubernetes objects
Experience working with storage (preferred): volumes, aggregates, backups, DR planning
Experience scheduling monitoring scripts using cron and Airflow
Experience with monitoring tools including Dynatrace, Apica, Grafana, etc.
Database knowledge, including SQL and NoSQL databases
Experience building CI/CD pipelines (preferred)
Cloud platform knowledge (specifically AWS) is required

Key Skills:

SRE, Angular, Cloud Technology, Deployment, IaC, Scripting, Terraform, Kubernetes, Ansible, Python
