Full-Time Senior Site Reliability Engineer
Experian is hiring a remote, full-time Senior Site Reliability Engineer. The career level for this opening is Senior Manager, and the role accepts remote applicants based in the United States. Read the complete job description before applying.
Experian
Job Details
Our Infrastructure team is looking to add a Senior Site Reliability Engineer (SRE). As a Senior SRE, you are responsible for the availability, efficiency, and resilience of our software systems.
You will divide your time between system operations responsibilities, developing software and tools that help increase system reliability and performance, and leading projects to improve the health of Cloud operations.
You will collaborate with software engineering teams, using technologies similar to theirs across the design, deployment, and continued operation of their software deliverables.
You will report to the Manager of Site Reliability Engineering, and the role can be worked remotely in the US.
Responsibilities
- Apply software engineering principles and best practices to system operations and administration.
- Design and implement tools to enhance the reliability, performance, and operations of commercial software systems.
- Lead configuration, testing, security, and deployment efforts for project work.
- Contribute to internal libraries, frameworks, and occasionally to open source projects.
- Monitor and improve runtime characteristics of department-wide software systems.
- Participate in on-call support rotations.
- Uphold Software Development Life Cycle quality standards by promoting DevOps tools and abstractions.
- Collaborate with project managers and other partners.
- Mentor teammates and lead technical interviews.
- Develop infrastructure abstraction layers to empower engineering teams.
Tech Stack
- Cloud Platforms: Google Cloud Platform (GCP), Amazon Web Services (AWS)
- Infrastructure as Code: Terraform, Atlantis
- CI/CD & Artifact Management: GitHub Actions, Harness, Nexus
- Containerization & Orchestration: Kubernetes, Helm
- Workflow Orchestration: Airflow, Cloud Composer, GKE
- Data & Messaging: BigQuery, CloudSQL, Pub/Sub
- ML Pipelines & Serverless: Kubeflow Pipelines, Cloud Run, Cloud Functions
- Monitoring & Visualization: Google Cloud Logging (Stackdriver), Looker
- Languages: Python, Golang, Scala
Experience Requirements
- 5+ years in a cloud-based infrastructure role with development and automation experience, at least 2 years with GCP
- 5+ years in a DevOps role with scripting/automation experience
- 3+ years of experience using Terraform in a cloud environment
- Experience developing Terraform modules
- Shell and Python scripting abilities; familiarity with Golang
- Experience with CI/CD workflows and SDLC tools (e.g., GitHub Actions, Jenkins, Harness)
- Knowledge of cloud architecture (network, storage, compute, messaging)
- Passionate about best practices and reliable, sustainable, scalable environments
- Experience or interest in: Airflow, Kubernetes, OPA / Rego, Go / Python