What are the responsibilities and job description for the AI Operations Platform Consultant position at MSR Technology Group?
Duration: 6 Month Contract
Location: Charlotte, NC or Jersey City, NJ
Schedule: Hybrid (3 days onsite per week)
Type: W2 Only
Job Description
We are seeking an experienced AI Operations Platform Consultant to support the deployment, optimization, and operational management of Large Language Models (LLMs) in a production-grade, mission-critical environment. The ideal candidate has strong hands-on experience with Kubernetes, TensorRT-LLM, Triton Inference Server, and MLOps/LLMOps practices at scale. This role is highly technical, performance-driven, and crucial to the stability and availability of AI inference systems supporting enterprise workloads.
Key Responsibilities
- Deploy, manage, and operate containerized AI/LLM services at scale using Kubernetes and OpenShift.
- Configure, tune, and optimize LLMs with TensorRT-LLM and deploy inference services on NVIDIA Triton Inference Server.
- Manage and support end-to-end MLOps/LLMOps pipelines, ensuring reliable and automated model deployment workflows.
- Set up monitoring frameworks for AI inference services, focusing on performance, availability, latency, and throughput.
- Troubleshoot and resolve production issues related to LLM deployment, containerized environments, model performance, and load balancing.
- Operate mission-critical systems following enterprise standards for incident, event, and change management.
- Build and maintain scalable infrastructure supporting high-performance model serving in production.
- Deploy models into microservices architectures, ensuring robust API design and production stability.
- Configure, optimize, and troubleshoot Triton Inference Server deployments for high-throughput, low-latency inference (a minimal client-side sketch follows this list).
- Apply model optimization techniques including quantization, pruning, knowledge distillation, and TensorRT-LLM-based acceleration.
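For illustration only, here is a minimal client-side sketch in Python of the kind of Triton-serving work described above. It assumes a Triton Inference Server instance reachable at localhost:8000 and a TensorRT-LLM ensemble exposed under a hypothetical model name and tensor names ("ensemble", "text_input", "max_tokens", "text_output"); these identifiers are assumptions, not part of the posting, and would come from the actual model's config.pbtxt.

```python
import numpy as np
import tritonclient.http as httpclient

TRITON_URL = "localhost:8000"   # assumption: HTTP endpoint of the Triton server
MODEL_NAME = "ensemble"         # assumption: name of the deployed TensorRT-LLM ensemble

client = httpclient.InferenceServerClient(url=TRITON_URL)

# Basic liveness/readiness checks before routing traffic to this instance.
assert client.is_server_live()
assert client.is_model_ready(MODEL_NAME)

# Tensor names and dtypes depend on the model's config.pbtxt; the names
# below ("text_input", "max_tokens", "text_output") are illustrative only.
prompt = np.array([["Summarize the incident report."]], dtype=object)
max_tokens = np.array([[128]], dtype=np.int32)

inputs = [
    httpclient.InferInput("text_input", prompt.shape, "BYTES"),
    httpclient.InferInput("max_tokens", max_tokens.shape, "INT32"),
]
inputs[0].set_data_from_numpy(prompt)
inputs[1].set_data_from_numpy(max_tokens)

outputs = [httpclient.InferRequestedOutput("text_output")]

result = client.infer(MODEL_NAME, inputs=inputs, outputs=outputs)
print(result.as_numpy("text_output"))
```

The same readiness checks (is_server_live / is_model_ready) are what a deployment pipeline or Kubernetes readiness probe would typically gate on before shifting traffic to a new model version.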
Required Skills & Experience
- Hands-on experience running containerized applications at scale on Kubernetes/OpenShift.
- Strong expertise with LLM deployment, tuning, and optimization.
- Proficiency with TensorRT-LLM and Triton Inference Server in production environments.
- Deep knowledge of MLOps/LLMOps pipelines, CI/CD for model deployment, and automated inference workflows.
- Experience monitoring, load balancing, and optimizing high-performance inference systems (see the metrics sketch after this list).
- Familiarity with enterprise operational practices (incident/change/event management).
- Knowledge of model optimization and performance-enhancement techniques such as quantization, pruning, and knowledge distillation (see the quantization sketch after this list).
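As an illustration of the monitoring expectation, the sketch below scrapes Triton's Prometheus-format metrics endpoint (port 8002, /metrics by default) and derives an approximate average request latency from the cumulative counters. The metric names used (nv_inference_request_success, nv_inference_request_duration_us, nv_inference_count) reflect Triton's standard metric set but should be verified against the actual deployment; the endpoint URL is an assumption.

```python
import requests

METRICS_URL = "http://localhost:8002/metrics"   # assumption: Triton's default metrics port

# Metric names assumed from Triton's standard metric set; verify against
# the /metrics output of the actual deployment.
WATCHED = (
    "nv_inference_request_success",
    "nv_inference_request_duration_us",
    "nv_inference_count",
)

def scrape_metrics(url: str) -> dict[str, float]:
    """Fetch the Prometheus text exposition and sum samples per watched metric."""
    totals: dict[str, float] = {}
    for line in requests.get(url, timeout=5).text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        for name in WATCHED:
            if line.startswith(name):
                # A sample line looks like: metric_name{labels} value
                value = float(line.rsplit(" ", 1)[-1])
                totals[name] = totals.get(name, 0.0) + value
    return totals

if __name__ == "__main__":
    totals = scrape_metrics(METRICS_URL)
    requests_ok = totals.get("nv_inference_request_success", 0.0)
    duration_us = totals.get("nv_inference_request_duration_us", 0.0)
    if requests_ok:
        # Cumulative duration divided by cumulative successes gives a rough
        # average latency per request, in milliseconds.
        print(f"avg request latency: {duration_us / requests_ok / 1000:.2f} ms")
    print(totals)
```

In practice these counters would be scraped by Prometheus and alerted on (latency, error rate, throughput) rather than polled by a script; the sketch only shows where the numbers come from.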
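As a toy illustration of the optimization techniques named above, the sketch below applies post-training dynamic quantization with PyTorch to a small stack of linear layers. This is not the TensorRT-LLM quantization workflow (which goes through TensorRT-LLM's own build tooling); it only shows the general idea of trading weight precision for memory footprint and inference speed.

```python
import torch
import torch.nn as nn

# Toy stand-in for the linear-heavy layers that dominate LLM inference cost.
model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.ReLU(),
    nn.Linear(4096, 1024),
).eval()

# Post-training dynamic quantization: weights are stored as int8 and
# activations are quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 1024))
print(out.shape)  # torch.Size([1, 1024])
```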