Full-Time Senior Machine Learning Platform/Ops Engineer
AUTO1 Group is hiring a full-time, remote Senior Machine Learning Platform/Ops Engineer. The position is at the Experienced career level and is open to remote applicants worldwide. Read the complete job description before applying.
AUTO1 Group
Job Details
- Own the ML lifecycle: Design, implement, and maintain robust, containerized, and reproducible pipelines for model training, evaluation, and deployment, across both batch and real-time settings.
- Operationalize models at scale: Build and manage ML services, APIs, and model serving infrastructure using tools like MLflow, Amazon SageMaker, and Feature Store.
- Automate and monitor: Set up and maintain monitoring, observability, and alerting systems to ensure high availability and performance (including model/data drift, feature logging, and inference latency).
- Accelerate experimentation: Develop and maintain internal libraries, templates, and platform tooling to improve reproducibility and simplify deployment workflows for all model teams.
- Ensure reliability and quality: Implement CI/CD pipelines for model and data workflows using Docker, Terraform, and Jenkins. Share best practices, mentor less experienced engineers, and foster strong collaboration across teams.
- Stay current: Continuously evaluate emerging MLOps technologies to improve efficiency, scalability, and reliability.
Hands-on MLOps experience: 2+ years of production experience operationalizing, deploying, monitoring, and maintaining ML models at scale.
Tooling: Proficient with infrastructure-as-code, CI/CD systems (Docker, Terraform, Jenkins, or equivalent), and at least one major cloud provider (AWS, GCP, or Azure).
Programming: Strong Python skills, including ML libraries such as scikit-learn, LightGBM, PyTorch, and TensorFlow, plus experience with SQL.
Monitoring: Familiar with monitoring and logging for ML pipelines (model drift detection, data validation, performance/feature logging).
Collaboration: Effective communicator with experience partnering across engineering and data science.
Bonus: Experience with feature stores, model version management, or building internal ML platforms.