Full-Time Senior DevOps Engineer
DataRobot Software Private Limited is hiring a remote Full-Time Senior DevOps Engineer. The career level for this opening is Expert, and the role accepts India-based applicants working remotely. Read the complete job description before applying.
DataRobot Software Private Limited
Job Details
DataRobot delivers AI that maximizes impact and minimizes business risk. Our platform and applications integrate into core business processes so teams can develop, deliver, and govern AI at scale. DataRobot empowers practitioners to deliver predictive and generative AI, and enables leaders to secure their AI assets.
As a Senior DevOps Engineer in the Control Plane Services team, you will help us deliver the mission of building, running, and owning the lifecycle of the core infrastructure for DataRobot SaaS.
Our team owns Kubernetes implementations across SaaS and enterprise solutions within the company and builds the platform that powers high-quality, rapid software delivery within DataRobot.
You will collaborate with technically skilled colleagues to address complex technical challenges and build internal products, services, and automation that establish the infrastructure and fleet management for the DataRobot AI SaaS Platform.
You will evaluate and influence the ongoing design, architecture, standards, and methodologies for operating services and systems.
The solutions you build will be high-impact and rolled out to our customer base across the world.
Key responsibilities:
- Develop a fully featured Kubernetes platform built around industry standards to improve the developer experience and enable self-service capabilities
- Work closely with internal application teams to improve their Kubernetes onboarding experience
- Work with app teams to understand their potential challenges and help them choose the best way to architect their systems on Kubernetes
- Design and implement new platform features to meet business and internal team goals
- Monitor and maintain the performance and reliability of the existing Kubernetes platform clusters, and identify and troubleshoot any issues that may arise
- Closely follow trends in the Kubernetes community and take advantage of new technologies as they emerge
- Work with customers and stakeholders to understand their needs and build the right products and solutions
- Take an active part in the strategy and roadmap definition and prioritization
- Seek, give, and receive feedback in a constructive manner, including but not limited to code reviews
- Engage in engineering on-call escalated support of services owned by the team
Knowledge, Skills, and Abilities:
- 7+ years of proven experience with high-quality infrastructure solutions in a collaborative environment including coding standards, code reviews, source control management, build processes, testing, and operations experience
- 3+ years of experience building infrastructure solutions in at least one major cloud provider (AWS, Azure, or GCP)
- Expert proficiency in Kubernetes. Experience in building and running software systems on Kubernetes clusters in production
- Expert proficiency in Kubernetes architecture and operations, including resource management, scheduling, auto-scaling, and cluster networking
- Hands-on experience with infrastructure provisioning and configuration using Infrastructure as Code (IaC) principles
- Hands-on experience in developing a wide variety of software and automation scripts with Python/Golang
- Experience designing and operating diverse CI/CD pipelines with Harness.io or similar platforms such as GitHub Actions, GitLab CI, Jenkins X, or Argo CD
- Deep understanding of core computer science, including operating systems, distributed systems, networking, and concurrent programming
- Experience and insight into designing, implementing, and supporting highly scalable cloud services from the ground up
- Aptitude to deal with ambiguity, and enthusiasm to help tackle difficult issues
- Ability to work effectively, both asynchronously and face-to-face, in a multicultural team spread across multiple time zones around the world
- Excellent critical thinking skills and ability to objectively evaluate multiple solutions with different tradeoffs
Nice to have:
- Open-source contributions
- Experience building Kubernetes operators
- Experience in building Infrastructure Platforms
- Expertise in developing a wide variety of software with Python/Golang
Possible employment type: full_time
Possible allowed location: India