Full-Time Sr. Cloud Site Reliability Engineer

Serve Robotics is hiring a remote, full-time Sr. Cloud Site Reliability Engineer. The career level for this opening is Senior Manager, and it accepts remote applicants based in the USA. Read the complete job description before applying.

This job was posted 9 months ago and is likely no longer active. We encourage you to explore more recent opportunities on our site. However, you may still try applying through the 'Apply Now' link below.

Serve Robotics

Job Title

Sr. Cloud Site Reliability Engineer

Posted

9 months ago on 18th February 2025

Job Type

Full-Time

Career Level

Senior Manager

Locations Accepted

USA

Job Details

At Serve Robotics, we’re reimagining how things move in cities. Our personable sidewalk robot is our vision for the future. It’s designed to take deliveries away from congested streets, make deliveries available to more people, and benefit local businesses. The Serve fleet has been delighting merchants, customers, and pedestrians along the way in Los Angeles while doing commercial deliveries.

We’re looking for talented individuals who will grow robotic deliveries from surprising novelty to efficient ubiquity. We are tech industry veterans in software, hardware, and design who are pooling our skills to build the future we want to live in. We are solving real-world problems leveraging robotics, machine learning and computer vision, among other disciplines, with a mindful eye towards the end-to-end user experience. Our team is agile, diverse, and driven. We believe that the best way to solve complicated dynamic problems is collaboratively and respectfully.

This is a senior-level, individual contributor position. You will balance hands-on responsibilities (building and maintaining critical SRE tooling and processes) with technical leadership (guiding architecture decisions, mentoring others in SRE practices, and steering strategic initiatives to enhance system resiliency and availability).

You’ll collaborate across engineering, product, and operations teams to ensure our systems meet strict uptime and performance goals, all while aligning with overarching business objectives.

Responsibilities

Instrumentation & Monitoring: Develop and refine monitoring and observability tools (metrics, logs, traces) to validate system availability and performance. Implement best practices for instrumentation using tools like Prometheus, Grafana, Datadog, or equivalent.
Reliability Engineering: Collaborate with development teams to design and implement solutions for higher availability in the cloud. Lead the definition and management of Service Level Indicators (SLIs) and Service Level Objectives (SLOs), ensuring alignment with business goals. Perform capacity planning, load testing, and performance tuning to ensure systems can handle projected traffic and workloads.
Incident Response & Prevention: Own the incident response process, including on-call rotation, alerts, and root cause analysis. Proactively identify reliability risks and propose mitigations to reduce system downtime. Conduct and facilitate postmortems to capture learnings, drive improvements, and prevent recurrence of issues.
Align System Health with Business Metrics: Map system availability metrics to direct business value, ensuring stakeholders understand how reliability impacts overall company objectives. Create reporting dashboards that connect reliability data with KPIs and business goals.
Technical Leadership & Mentorship: Serve as an in-house SRE expert, advising teams on reliability-oriented designs, coding practices, and testing methodologies. Mentor junior and mid-level engineers, fostering a culture of continuous learning, automation, and operational excellence.
Collaboration & Education: Work closely with engineering, product, and operations teams to advocate for SRE best practices. Conduct training sessions and share knowledge to build a culture of reliability throughout the organization.

Skills

Automation & IaC, CI/CD, Cloud, Containers & Orchestration, Observability Tools

FAQs

What is the last date for applying to the job?

The deadline to apply for Full-Time Sr. Cloud Site Reliability Engineer at Serve Robotics is 20th March 2025. We consider jobs older than one month to have expired.

Which countries are accepted for this remote job?

This job accepts applicants from the USA.

Apply Now

