Full-Time Staff Site Reliability Engineer
SailPoint Technologies is hiring a remote, full-time Staff Site Reliability Engineer. This is an Expert-level position open to remote applicants based in the USA. Read the complete job description before applying.
SailPoint Technologies
Job Details
As a Staff Site Reliability Engineer (Staff SRE) at SailPoint, you will be a key member of our Reliability Engineering team, driving reliability practices for the Identity Security Cloud platform.
You are immensely passionate about reliability practices and operational excellence.
Responsibilities:
- Make it easy for everyone to create, consume, manage, and scale reliable cloud production services.
- Keep up with industry trends to improve end-to-end reliability and maintainability for all services.
- Coach engineering teams on observability best practices such as setting up well-defined Service Level Objectives (SLOs).
- Analyze performance of services and recommend infrastructure/code changes that will improve capacity and performance.
- Enable our engineering teams to scale our enterprise operations by providing guidance, best practices, and support as part of an SRE Center of Excellence.
- Manage cross-functional requirements working with Engineering, Product, Services, and other departments.
- Mentor teams on quality across design reviews, code, test cases, automation, observability, root cause analysis, and self-healing.
- Influence architectural design, implementation, consolidation, and simplification for global scale.
- Drive operational excellence to deliver frictionless operations, a happy on-call experience, and an optimal customer experience.
Qualifications:
- 8+ years of experience in SRE or DevOps production operations supporting a highly available environment for a SaaS product or cloud service provider.
- Strong proficiency with one or more programming languages (Java, Python, Go, etc.).
- A Bachelor's degree in Computer Science or another technical discipline, or equivalent experience, is preferred but not required.
Requirements:
- Due to FedRAMP requirements, US Citizenship is required to be considered for this role.
- Experience with cloud infrastructure environments, preferably AWS, and infrastructure as code, preferably Terraform.
- Strong proficiency with containerization technology and/or Kubernetes.
- In-depth experience with metrics, tracing, and logging observability tools such as Prometheus, Grafana, Honeycomb, and Kibana.
- Experience with incident management, including conducting incident reviews.
- Strong understanding of Linux, software development, systems, networking, and cloud concepts.
- A positive and collaborative demeanor, combined with the ability to coach, mentor, and delegate.
- Excellent communication skills.
- Life-long learner – you stay up to date with technology trends, spend time learning new technologies, and share your learnings with your team.