Full-Time Staff Site Reliability Engineer
Experian is hiring a remote, full-time Staff Site Reliability Engineer. This is an Experienced-level role, open to remote applicants based in Costa Rica. Read the complete job description before applying.
Experian
Job Details
We are expanding our Site Reliability Engineering (SRE) team to offer global coverage. We embrace the SRE model: we believe everything should be automated and that software should run software.
Platform: We use the latest technology, including Kubernetes, containers, pipelines, and monitoring. You'll report to the SRE Manager.
Responsibilities:
- Uptime: Ensure Experian One's cloud SaaS uptime.
- Monitoring: Monitor the platform and provide alerts.
- Incident Response: Respond to incidents and restore service.
- Issue Resolution: Gain system knowledge to assess issues and find owners for resolutions.
- Automation: Identify and automate manual processes.
- Incident Management: Coordinate others and restore availability during disruptions.
- Complex Queries: Write complex queries using multiple tools.
- System Design Review: Review system designs to address resiliency, scalability, and monitoring issues.
Technical Skills: Kubernetes, Infrastructure as Code, high availability principles, and experience with tools like Splunk, Dynatrace, ThousandEyes, ServiceNow, Jira, Jenkins, Python, and Prometheus.
Experience: 5+ years supporting complex, scaled systems in production, with strong Linux skills and experience in cloud-native application design.
Requirements: Linux, networking, incident management, and blameless post-incident reviews (PIRs). Proficiency in at least one programming language.
Culture: Customer-obsessed, geographically diverse team. Strong written and verbal English fluency required.
Benefits: Competitive benefits, home-based role.
Important Note: No relocation is available.