Full-Time Senior Site Reliability Engineer
Coupa is hiring a remote Full-Time Senior Site Reliability Engineer. The career level for this opening is Experienced, and the role accepts Poland-based applicants working remotely. Read the complete job description before applying.
Coupa
Job Details
As a Senior Site Reliability Engineer, you will play a crucial role in the development of solutions for our Enterprise platform. You will be developing applications that provide self-service and increased efficiency to a diverse group of internal customers across Cloud Operations, Engineering, Customer Success & Support, and Customer Value Management. When you are successful, you will significantly accelerate the ability of our teams to better serve our customers.
What You'll Do:
- Leverage automation to increase reliability, availability, and performance of the infrastructure
- Ensure that services and infrastructure are reliable, fault-tolerant, efficiently scalable, and cost-effective
- Evaluate products and technologies in the industry that can be applied to optimize Coupa’s internal processes and workflows, simplify our operations, reduce technical debt, and eliminate toil
- Coordinate Incident, Problem, Release and Change Management
- Manage, debug, and troubleshoot issues with cloud infrastructure and with the tools used to support these tasks
- Leverage observability and monitoring to make informed decisions
- Participate in an on-call rotation and be comfortable working in a 24x7 environment
Requirements:
- 5-7 years of experience in Cloud Operations Support / SRE
- A resourceful critical thinker and problem-solver with a passion for applying technology to make work life better
- Hands-on experience with Cloud/SaaS architecture using AWS/GCP is a must
- Hands-on experience with Observability and Monitoring (New Relic, OpenTelemetry, Grafana, Dynatrace, Datadog, etc.)
- Hands-on experience with Log Parsing tools (Splunk, ELK, etc.)
- Hands-on experience with one or more programming languages such as Ruby or Python, or with another object-oriented or scripting language
- Experience with SDLC, Agile Practices, Kubernetes, Docker
- Hands-on experience with Terraform and configuration management tools like Chef, Ansible, or equivalent
- Expertise in problem-solving and analyzing global-scale distributed systems
- Excellent written and verbal communication skills