What are the responsibilities and job description for the Machine Learning Operations Engineer position at Augment Professional Services?
Location / Worksite
Remote or Hybrid (U.S.-Based Preferred)
Travel as Required for Project Needs
About Augment Professional Services (APS)
Augment Professional Services (APS) partners with leading organizations across the Energy, Technology, Utilities, and Engineering sectors to deliver specialized talent, consulting expertise, and project support. Our mission is to connect experienced professionals with organizations executing complex projects and critical infrastructure initiatives.
Position Overview
The Machine Learning Operations (MLOps) Engineer is responsible for designing, deploying, and maintaining scalable machine learning solutions in production across multi-cloud and data platform environments. This role plays a critical part in operationalizing machine learning models by building robust pipelines, enabling automation, and ensuring reliability, performance, and governance across AWS, Microsoft Azure, and Snowflake ecosystems.
Working closely with data scientists, data engineers, and cloud platform teams, the MLOps Engineer bridges the gap between model development and production deployment. This position focuses on creating secure, scalable, and cost-efficient ML platforms that support end-to-end lifecycle management, including model training, deployment, monitoring, and continuous improvement.
The ideal candidate brings strong experience in cloud-native architectures, CI/CD automation, and production-grade ML systems, with hands-on expertise in AWS, Azure, and Snowflake environments.
Key Responsibilities
- Design and implement end-to-end machine learning pipelines including data ingestion, feature engineering, model training, validation, deployment, and monitoring
- Deploy and manage machine learning models in production across AWS, Azure, and Snowflake platforms
- Build and maintain batch and real-time inference pipelines using cloud-native and platform-native services
- Develop and automate CI/CD pipelines for model packaging, testing, deployment, and rollback
- Integrate ML workflows with services such as Amazon SageMaker, AWS Lambda, Azure Machine Learning, Azure Data Factory, and Snowflake
- Build and manage orchestration workflows using tools such as Apache Airflow, Azure Data Factory, or similar platforms
- Implement model lifecycle management practices including experiment tracking, model registry, and governance frameworks
- Monitor model performance, including accuracy, drift, latency, throughput, and pipeline reliability
- Establish and manage deployment strategies such as canary releases, blue-green deployments, shadow testing, and rollback mechanisms
- Collaborate cross-functionally to transition machine learning models from research to production environments
- Ensure security, compliance, traceability, and access controls across data and ML systems
- Optimize performance, scalability, and cost efficiency across cloud and data platforms
- Document architecture designs, deployment standards, and operational procedures
Qualifications:
Required Technical Skills
- Minimum of 5 years of experience in MLOps, machine learning engineering, platform engineering, or DevOps
- Hands-on experience with AWS, Microsoft Azure, and Snowflake in building or supporting production ML/data platforms
- Strong programming skills in Python and SQL
- Experience deploying and managing machine learning models in production environments
- Experience with cloud ML services such as Amazon SageMaker and Azure Machine Learning
- Experience building and integrating data pipelines with Snowflake
- Proficiency with CI/CD pipelines, infrastructure automation, and model versioning
- Experience with containerization and orchestration tools such as Docker and Kubernetes
- Experience with workflow orchestration tools such as Apache Airflow, Azure Data Factory, or similar
- Familiarity with monitoring, logging, alerting, and observability frameworks
- Strong understanding of data engineering concepts, APIs, and distributed systems
- Proven troubleshooting, communication, and cross-functional collaboration skills
Preferred Experience / Qualifications
- Master’s or PhD in Computer Science, Computer Engineering, or a related technical field
- Experience with Snowflake Cortex AI, Snowpark, or machine learning workloads within Snowflake
- Experience with generative AI platforms such as Amazon Bedrock or Azure OpenAI
- Experience building real-time inference systems, event-driven architectures, or serverless pipelines
- Familiarity with feature stores, vector databases, and retrieval-augmented generation (RAG) systems
- Experience with infrastructure-as-code tools such as Terraform, AWS CloudFormation, or Azure Resource Manager
- Understanding of security, compliance, and governance frameworks in regulated environments
- Experience implementing A/B testing, shadow deployments, and advanced model release strategies
About APS Consulting
Augment Professional Services (APS) delivers specialized talent, consulting expertise, and project support to organizations operating in complex technical environments. Our teams partner with clients across the Technology, Energy, Utilities, and EPC sectors to support critical initiatives in digital transformation, infrastructure modernization, engineering delivery, and industrial construction.
Through a flexible services model that includes managed services, project-based delivery, and embedded technical expertise, APS helps organizations accelerate innovation, scale capabilities, and execute high-impact initiatives with confidence.
Equal Opportunity Statement
Augment Professional Services (APS) is an equal opportunity employer. We are committed to creating an inclusive environment for all employees and applicants and do not discriminate on the basis of race, color, religion, sex, national origin, age, disability, veteran status, or any other protected characteristic in accordance with applicable laws and regulations.
Salary: $65 - $82