Full-Time Senior Cloud Infrastructure Engineer
Waabi is hiring a remote Full-Time Senior Cloud Infrastructure Engineer. The career level for this opening is Senior Manager, and the role accepts remote applicants based in the USA and Canada. Read the complete job description before applying.
Waabi
Job Details
Waabi, founded by AI pioneer and visionary Raquel Urtasun, is an AI company building the next generation of self-driving technology.
Waabi is looking for a Senior/Staff Cloud Infrastructure Engineer to help improve ML infrastructure and support company engineering and research efforts.
You will:
- Work alongside engineers and researchers using an AI-first approach to enable safe self-driving at scale.
- Collaborate with cross-functional teams to understand cloud usage needs and pain points.
- Propose cloud strategies for compute and data usage in training and simulation workloads.
- Design and implement scalable and resilient cloud infrastructure for long-term reliability and adaptability.
- Devise and promote best practices for cloud usage in training and simulation environments, overseeing cloud strategies company-wide.
Qualifications:
- BS, MS, or PhD in Computer Science or a similar technical field, or equivalent practical experience.
- 5+ years of relevant industry experience.
- Experience reading and developing production-quality software.
- Deep understanding of cloud compute and data storage for distributed training and inference workloads.
- Familiarity with the Python, Go, Rust, or C++ ecosystems.
- Experience with public cloud platforms (AWS preferred).
- Experience with infrastructure-as-code systems (Terraform preferred).
- Experience with job scheduling and resource allocation.
- Experience with containers and container orchestration (Docker, ECS, Kubernetes).
- Experience and comfort working with Linux systems.
- Experience building platform services enabling other teams.
- Open-minded and collaborative team player.
- Passionate about self-driving, solving hard problems, and creating innovative solutions.
- Experience working in an Agile/Scrum environment.
Bonus Qualifications:
- Experience with on-premises servers, network equipment, and scale-out storage systems.
- Experience with CI/CD pipelines and release management.
- Experience with common ML tools, workflows, and frameworks (Kubeflow, MLflow).
- Understanding of system performance tuning at software, hardware, and network levels.