Full-Time Site Reliability Engineer
Unitary is hiring a remote, full-time Site Reliability Engineer. The role is at the Experienced career level and is open to remote applicants based in Europe and the UK. Read the complete job description before applying.
Unitary
Job Details
Company: We are a rapidly growing startup developing solutions that blend human expertise and AI agents to handle manual customer and marketplace operations tasks. Our unique approach combines the strengths of human expertise (high accuracy and nuanced decision-making) with the advantages of AI automation (speed and cost efficiency). This cutting-edge technology helps businesses solve real-world challenges in trust & safety and beyond without complex technical integration.
Role: We are now looking for a Site Reliability Engineer to ensure our systems run smoothly and reliably at scale. Your expertise in monitoring, observability, and system automation will help maintain the high availability and performance our customers depend on. You will work at the intersection of development and operations, using your technical skills to build robust infrastructure and streamline deployment processes. Your mission is to proactively identify and resolve system issues before they impact our customers.
Responsibilities:
- Design and implement comprehensive alerting systems.
- Collaborate with development teams to ensure observability.
- Optimize on-call processes and create runbooks.
- Build self-healing systems using AI tools.
- Develop automation tools and diagnostic capabilities.
- Ensure secure and reliable code deployment.
- Join the 24/7 support rotation.
Requirements:
- Experience with visualization tools (Grafana).
- Proficiency with metrics platforms (Prometheus, InfluxDB, or OpenTelemetry).
- Experience with incident management tools (Incident.io).
- Strong problem-solving skills.
- Production code experience (Go or Python).
- Collaborative working style.
Bonus Skills:
- Experience with fully remote, international teams.
- Previous startup experience.
- Experience building Slack bots or similar automation tools.
- Experience with CI/CD platforms (GitLab CI, ArgoCD).
- Experience with Kubernetes and infrastructure as code (Terraform).
- Familiarity with MLOps practices and tools.