What are the responsibilities and job description for the MLOps Engineer position at SIERRA AI?
Join us in building the backbone of production-grade AI systems. As an MLOps Engineer, you will be responsible for designing, deploying, and maintaining scalable machine learning infrastructure that powers real-world applications.
You will work at the intersection of machine learning, software engineering, and DevOps—ensuring models move seamlessly from experimentation to reliable production systems. This role is ideal for engineers who enjoy solving complex infrastructure challenges and enabling ML teams to move faster.
- Design, build, and maintain end-to-end ML pipelines.
- Automate model training, validation, and deployment workflows.
- Develop CI/CD pipelines specifically for ML systems.
- Monitor production models for performance, drift, and reliability.
- Manage model versioning, experiment tracking, and reproducibility.
- Collaborate with ML engineers, data scientists, and backend teams.
- Optimize infrastructure for scalability, cost, and performance.
- Ensure best practices in security, governance, and compliance.
- Programming: Python, Bash
- ML Tools: MLflow, Weights & Biases, Kubeflow
- Cloud Platforms: AWS (SageMaker, S3, EC2), GCP (Vertex AI), Azure ML
- Orchestration: Airflow, Prefect
- Containerization: Docker, Kubernetes
- Data Tools: SQL, Spark, Kafka (streaming pipelines)
- CI/CD: GitHub Actions, Jenkins, GitLab CI
- Monitoring: Prometheus, Grafana, ELK Stack
- Key Focus: Build reliable, scalable, and automated ML systems
Required Skills:
- Strong experience in Python and software engineering fundamentals.
- Hands-on experience with MLOps tools and pipeline automation.
- Experience deploying ML models in production environments.
- Familiarity with cloud platforms (AWS/GCP/Azure).
- Knowledge of containerization and orchestration (Docker, Kubernetes).
- Understanding of ML lifecycle and model evaluation concepts.
- Experience with CI/CD pipelines and version control (Git).
Valuable Experience (Nice to Have):
- Experience with real-time ML systems or streaming pipelines.
- Familiarity with LLM deployment and inference optimization.
- Knowledge of feature stores and model registries.
- Exposure to distributed systems and large-scale data processing.
- Understanding of monitoring, logging, and observability systems.