Full-Time Senior Systems Engineer Autonomous Vehicle Infrastructure
2100 NVIDIA USA is hiring a remote Full-Time Senior Systems Engineer Autonomous Vehicle Infrastructure. The career level for this opening is Senior Manager, and the role accepts remote applicants based in Santa Clara, CA, US. Read the complete job description before applying.
Job Details
The Autonomous Vehicle (AV) Infrastructure group builds the foundational infrastructure and tools that enable NVIDIA's AV program.
We are seeking a motivated Senior Engineer to join our team in building and scaling the cloud-native infrastructure that powers hundreds of microservices and large-scale HPC clusters (15k+ GPUs).
Your role will be critical in driving infrastructure innovation across our organization.
What you'll be doing:
- Develop, operate, and maintain tooling and automation to enhance developer productivity and operational efficiency for the organization
- Lead the development of infrastructure automation frameworks and CI/CD pipelines, ensuring robust, scalable, and secure deployment of cloud-native applications
- Engage directly with engineering users to understand their needs and improve their experience by recommending robust, scalable cloud solutions
- Contribute to the design and architecture of the cloud infrastructure, traffic and networking components to meet the evolving needs of our internal developer platform
- Play a pivotal role in improving the reliability and performance of cloud infrastructure and services
- Troubleshoot complex production issues
What we need to see:
- BS/MS in Computer Science, Engineering, or a related STEM field (or equivalent experience)
- 8+ years of professional experience in a related field
- Strong programming fundamentals with expertise in Go and Python
- Experience developing and operating microservices at scale
- Good understanding of SRE best practices, alerting, and observability
- Advanced Kubernetes workload management expertise, including traffic management, deployment strategies, observability, and security
- Strong Infrastructure as Code (IaC) fundamentals with experience in developing infrastructure CI/CD pipelines, automation frameworks, and IaC libraries
Ways to stand out from the crowd:
- Motivated self-starter with an equal balance of strong problem-solving skills and customer-facing communication skills
- Excellent written and verbal interpersonal skills
- Contributions to open-source projects
- Previous experience building sophisticated tooling and SRE automation on large GPU/CPU clusters
- Deep AWS expertise across core services (VPC, IAM, EC2, S3, RDS, CloudFront, EKS) with proven experience in designing and managing scalable cloud infrastructure