Full-Time Site Reliability Engineer
Xplor is hiring a remote Full-Time Site Reliability Engineer. The career level for this opening is Experienced, and the role accepts St. Louis, MO-based applicants working remotely. Read the complete job description before applying.
Xplor
Job Details
Site Reliability Engineering (SRE) is what you get when you treat operations as if it is a software problem.
Our mission is to protect and provide for the software and systems behind Xplor’s services with an ever-watchful eye on their availability, latency, performance, and capacity. As an SRE, you will use your experience running production-grade software to build automation that allows our systems to run smoothly with minimal intervention.
This is an ideal role for someone who wants to work through problems and provide solutions that have a significant positive impact on our products and customers.
Essential duties and responsibilities include:
- Automate pipelines to production and make them safe, secure, and repeatable so that development teams can self-serve
- Configure and manage production workloads in Azure, AWS, and on-premises data centers
- Construct and maintain infrastructure-as-code using Terraform, Puppet, and Ansible
- Solve problems relating to mission-critical services and build automation to prevent their recurrence, with the goal of automating the response to all non-exceptional service conditions
- Influence and create new designs, architectures, standards, and methods for large-scale distributed systems
- Build world-class alerting, monitoring, and capacity management systems to manage our platform’s health
- Participate in on-call rotations as needed
Technologies:
- Languages: Python, Bash, PowerShell, C#, Java, Terraform Configuration Language
- Workload management: Azure, AWS, container orchestration
- CI/CD: Azure DevOps
- Configuration Management: Ansible, Puppet
- Infrastructure as Code: Terraform
- Monitoring: Coralogix, Uptrends, OpenTelemetry
Requirements:
- Five years of experience in a software development and/or operations role
- Intermediate experience with CI/CD pipeline technologies
- Professional experience in containerization technologies
- Intermediate experience with scripting (PowerShell, Bash, or Python preferred)
- Foundational understanding of API endpoints
- Ability to troubleshoot running applications in a systematic manner
- Comfortable working with applications on-premises and in Azure
- Working proficiency with Git
- Comfortable with a Scrum-based development lifecycle
- Excellent analytical and problem-solving skills
- Strong and clear written and verbal communication skills
- Ability to translate complex ideas into diagrams, user stories, and ultimately working software
- Strong sense of ownership around producing results