What are the responsibilities and job description for the Azure Stack AI DevOps Specialist position at VAK Consulting LLC?
Azure Stack AI DevOps Specialist
Location: Chicago, IL
The Azure Stack AI DevOps Specialist designs, implements, and manages CI/CD pipelines for AI and
Machine Learning applications hosted on Azure Stack infrastructure. You ensure that
infrastructure is treated as code (IaC) and that AI models are seamlessly deployed, monitored, and
retrained in hybrid cloud environments.
Key Roles & Responsibilities
1. Hybrid Infrastructure Management
Provisioning: Use Terraform or Bicep to automate the setup of Azure Stack Hub or Edge
resources.
Scalability: Configure GPU-enabled nodes on Azure Stack to handle intensive AI/ML workloads.
Governance: Implement Azure Policy and Role-Based Access Control (RBAC) to maintain security
across on-premises and cloud environments.
2. MLOps & CI/CD Pipelines
Automation: Build end-to-end pipelines using Azure Pipelines or GitHub Actions to automate
model training, testing, and deployment.
Model Versioning: Manage model artifacts and datasets to ensure reproducibility of AI results.
Edge Deployment: Orchestrate the deployment of AI models to Azure Stack Edge devices using
IoT Edge and Kubernetes (AKS).
3. Monitoring and Optimization
Observability: Implement Azure Monitor and Application Insights to track the health of both the
infrastructure and the AI model's performance (e.g., detecting data drift).
Performance Tuning: Optimize resource allocation for containers running AI inference to reduce
latency at the edge.
4. Security & Compliance
DevSecOps: Integrate security scanning into the pipeline to check for vulnerabilities in container
images and AI libraries.
Data Residency: Ensure that AI processing complies with local data residency laws by keeping
sensitive data on the Azure Stack Hub within the local datacenter.
Technical Skill Requirements
Category | Key Tools & Skills
Cloud Platforms | Azure Stack Hub, Azure Stack Edge, Azure Stack HCI
DevOps Tools | Azure DevOps, GitHub Actions, Jenkins
IaC & Configuration | Terraform, Bicep, ARM Templates, Ansible
Containers | Docker, Azure Kubernetes Service (AKS) on Stack
AI/ML Frameworks | Azure Machine Learning, PyTorch, TensorFlow, MLflow
Scripting | Python (crucial for AI), PowerShell, Bash
Key Differences from a Standard Azure DevOps Role
Connectivity Awareness: You must design systems that can function in disconnected or low-
bandwidth scenarios (common in Azure Stack environments).
Hardware Knowledge: Understanding the physical constraints of Azure Stack Edge (like FPGA or
GPU capabilities) is necessary for optimizing AI models.
MLOps Focus: Unlike standard app deployment, you are managing the lifecycle of a "living" model that
requires constant data feeding and retraining loops.
Salary: $50 - $60