
On-prem Platform Engineer

Ampstek
Charlotte, NC Contractor
POSTED ON 6/3/2026
AVAILABLE BEFORE 7/2/2026

Role: On-prem Platform Engineer

Location: Charlotte, NC (Onsite)

Job Type: Long-Term Contract


Job Description:

Key Skills:

Must-Have Skills (Mandatory Keywords)

LLM Inference & Optimization

• vLLM, TensorRT-LLM, Triton Inference Server, SGLang

• Inference optimization techniques:

  ◦ Continuous batching

  ◦ Speculative decoding

  ◦ KV cache / prefix caching

• Model optimization:

  ◦ FP8, AWQ, GPTQ

Distributed & GPU Systems

• Tensor parallelism and large model scaling

• CUDA, NCCL, GPU architecture

• GPU partitioning & optimization (MIG)

Kubernetes & ML Serving

• Kubernetes-based ML serving platforms

• KServe, OpenShift AI

• Helm charts, Operators, platform automation

GPU Orchestration

• Run:AI or similar GPU scheduling/orchestration platforms

• Multi-tenant GPU workload management

Platform Engineering

• Experience building internal AI/ML platforms (on-prem or hybrid)

• Strong automation and system design mindset

Observability & Performance

• Prometheus, Grafana

• ML observability (model latency, throughput, drift, resource utilization)

• Performance benchmarking and tuning

Good to Have / Preferred Skills:

• Experience with LLMOps / GenAI pipelines

• Exposure to hybrid cloud (on-prem with GCP/Azure integration)

• Familiarity with Inferentia / alternative accelerators

• Knowledge of service mesh / networking in GPU clusters

Key Responsibilities:

• Build, configure, and operate on-prem Kubernetes/OpenShift AI platforms for deploying and serving GenAI models and LLM inference workloads.

• Design and optimize high-performance inference stacks using vLLM, TensorRT-LLM, Triton Inference Server, SGLang, and advanced techniques (continuous batching, speculative decoding, KV caching).

• Manage GPU orchestration and capacity using Run:AI, MIG, CUDA/NCCL, and tensor parallelism to maximize utilization and throughput.

• Deploy and operate Kubernetes ML serving frameworks (KServe, Helm, Operators) for scalable, reliable model serving.

• Drive inference optimization and benchmarking, leveraging FP8, AWQ, GPTQ, and performance tools such as GuideLLM and Locust.

• Implement observability and ML monitoring using Prometheus, Grafana, and Arize AI to ensure SLA/SLO compliance for GenAI services.

• Collaborate with ML and research teams to onboard new models, tune inference performance, and productionize GenAI use cases.


Thanks and regards,


Deepa Maurya | Technical Recruiter - US Staffing

Email: deepa.m@ampstek.com | Desk: (609) 527-8971

Ampstek LLC – Global IT Partner | www.ampstek.com

Hourly Wage Estimation for On-prem Platform Engineer in Charlotte, NC
$42.00 to $55.00




