Full-Time Site Reliability Engineer
NBCUniversal is hiring a remote, full-time Site Reliability Engineer. The role is at the Experienced career level and is open to remote applicants based in Englewood Cliffs, New Jersey. Read the complete job description before applying.
NBCUniversal
Job Details
NBCUniversal has an opening for a Site Reliability Engineer focused primarily on supporting live channel distribution on the Video Streaming Engineering team.
This position will be part of a dedicated 24x7 team that supports and maintains distribution systems and diagnoses and prevents on-air issues.
Responsibilities include:
- Investigating broadcast system issues to find root causes.
- Driving investigations of broadcast issues, reporting findings to leadership and operations.
- Following up with team members and vendors as needed, and pressing vendors for root causes and solutions.
- Creating documentation outlining issues, root causes, and resolution steps.
- Assisting with deployment and testing of patches/fixes.
- Assisting with design, analysis, or evaluation of projects.
- Supporting on-air systems integration and rollout.
- Providing 24x7 on-air system support, including on-call support during rollouts and special events.
- Attending daily maintenance and operations review calls to report on issues and fixes.
Basic Requirements:
- BS in Engineering/Computer Science.
- Passion for investigating issues and problem-solving.
- 5+ years DevOps/SRE experience in a high-traffic cloud environment (AWS preferred).
- 5+ years in a support/analysis role.
- Experience with deployment automation in AWS (CloudFormation, Terraform, Ansible).
- Familiarity with containerization and orchestration services (Kubernetes, Docker).
- Familiarity with CI/CD orchestration tools (GitHub Actions, Jenkins).
- 5+ years of Linux System Administration.
- 5+ years coding in Go, Python, Ruby, Java, or shell languages.
- Experience with modern log/metric aggregation software (CloudWatch, Elasticsearch + Kibana, Splunk, Grafana).
- Experience with continuous delivery/frequent releases.
- Methodical and logical problem-solving skills.
- Willingness to prioritize business needs.
- Working knowledge of the OSI model and experience troubleshooting networking issues.
Desired Characteristics:
- 3+ years experience in Media & Entertainment.
- 3+ years in 24x7 production environments.
- 3+ years supporting IT/Broadcast Systems.
- 5+ years customer-facing experience.
- Experience with Live TV Broadcasting, OTT Streaming, codecs, and ARQ technologies (a plus).