Job Description:
We are looking for passionate and hardworking DevOps / Site Reliability Engineers (SRE) to work with one of our leading customers.
Responsibilities:
- As an SRE, you’ll need to solve production problems using data, teamwork, and your expertise.
- SREs own the platform stack of the production environment. From issues with application functionality to infrastructure disasters, our responsibilities are both broad and deep.
- The platform runs on the Microsoft Azure cloud and follows a microservices architecture that consists of open-source, vendor-licensed, and internally developed tools for functions such as provisioning, software deployment, logging, and monitoring.
Requirements:
- Strong sense of ownership and integrity demonstrated through clear communication and collaboration.
- Experience in managing and scaling distributed systems in a public, private, or hybrid cloud environment.
- The ability to design, author, and release code in C# on .NET.
- Strong drive to automate manual operations and to improve them through repeated iteration.
- Good understanding of various Azure cloud services and resources.
- Familiarity with microservices architecture and container orchestration with Kubernetes.
- Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Puppet, Chef, Ansible, and Spinnaker).
- Experience with deploying, supporting, and supervising new and existing services, platforms, and application stacks.
- Familiarity with Continuous Integration and Continuous Delivery processes and a solid understanding of the software development life cycle.
- Excellent troubleshooting and problem-solving skills.
- Experience with scale testing, disaster recovery, and capacity planning.
- BS/MS in Computer Science, or equivalent experience in software development or production operations in a large-scale environment.
- Willingness to participate in an on-call rotation.