Full-Time Senior Site Reliability Engineer
Arista Networks is hiring a remote, full-time Senior Site Reliability Engineer. The career level for this opening is Experienced, and the role accepts remote applicants based in the USA. Read the complete job description before applying.
Arista Networks
Job Details
As a Senior SRE, you’ll be responsible for our global CloudVision service fleet. This includes:
- Building the CI/CD lifecycle for services, from inception and design to deployment and scaling
- Improving operational processes through automation
- Identifying key service indicators to be used in capacity planning
- Owning disaster recovery and management
- Driving infrastructure and cloud-based application security design
- Leading sustainable incident response and blameless postmortems
- Being an active member of our globally distributed on-call team
Arista’s CloudVision is an enterprise network management and streaming telemetry SaaS offering. CloudVision is deployed on Kubernetes across global regions, using Spinnaker for our CI/CD pipeline. Our tech stack runs on GKE, with HBase/Hadoop as the main distributed database and storage layer, Elasticsearch for powering search, ClickHouse for fast real-time queries of flow data, our own Kafka-based distributed real-time stream processing layer for analytics, and TensorFlow for ML analysis. Our monitoring system is built on top of Prometheus, Grafana, Loki, and other OSS tools.
Qualifications:
BS/MS degree in Computer Science or a related subject, or equivalent relevant experience.
5+ years of software engineering experience.
Experience developing or managing deployments of distributed database systems or scale-out applications in a SaaS environment.