
LLM Application Development Engineer

MASH Pro Tech
Santa Clara, CA Contractor
POSTED ON 5/30/2026
AVAILABLE BEFORE 6/29/2026
AI Application Engineer - LLM Application Development
Santa Clara - 5 days Onsite
 
NVIDIA-Specific Stack
  • NVIDIA NIM — deployment, inference, API integration, model lifecycle · Advanced · Must-have
  • NeMo framework — model configuration, inference optimization · Proficient · Must-have
  • NeMo Guardrails — rails configuration, content safety models, jailbreak detection, topical control · Advanced · Must-have
  • NVIDIA Riva — ASR / TTS integration in application layer · Proficient · Must-have
  • NVIDIA NIM models — familiarity with llama-3.1-nemoguard-8b, llama-3.2-nv-rerankqa-1b-v2, llama-3.2-nv-embedqa-1b-v2, nemotron-3-super-120b-a12b · Proficient · Nice to have

Duties:

  • Work on the intelligence layer for multiple programs, owning all model quality, RAG accuracy, prompt engineering, and AI safety across applications
  • Build the Socratic tutor persona, adaptive learning recommendation engine, multi-modal AI (text and voice via NVIDIA Riva), RAG evaluation framework, and feedback loop into retrieval
  • Orchestrate the 6-LLM call chain (NeMo Guardrails → intent classification → query rewriting → RAG → synthesis), Perplexity web search integration, NIM recommendation engine, and compatibility check logic
  • Deliver production-grade AI quality from launch. This is not a research or prototyping role: accuracy thresholds, latency requirements, and safety guardrails must pass InfoSec adversarial testing before Release 1
Required Skills
LLM Application Development
  • LLM prompt engineering — system prompts, few-shot examples, chain-of-thought, instruction following · Expert · Must-have
  • Multi-step LLM chain orchestration — LangChain, LlamaIndex, or custom orchestration · Expert · Must-have
  • Multi-turn conversation design — context window management, conversation summarization, session memory · Advanced · Must-have
  • Streaming LLM response handling — token-by-token streaming, partial response rendering · Advanced · Must-have
  • Model selection and benchmarking — matching model size to task; balancing latency, cost, and accuracy · Advanced · Must-have
RAG Pipeline Design & Quality
  • RAG pipeline design — chunking strategy, embedding model selection, retrieval configuration · Expert · Must-have
  • Vector similarity search tuning — index parameters, similarity thresholds, retrieval depth · Advanced · Must-have
  • Reranking — cross-encoder rerankers, relevance scoring · Advanced · Must-have
  • RAG evaluation frameworks — RAGAS, TruLens, or equivalent; automated eval pipelines · Advanced · Must-have
  • Hybrid search — combining dense vector retrieval with BM25 or keyword search · Proficient · Nice to have
AI Safety & Guardrails
  • Prompt injection detection and mitigation · Advanced · Must-have
  • Jailbreak testing and red-teaming LLM systems · Advanced · Must-have
  • Content safety classifier integration · Advanced · Must-have
  • Hallucination detection and mitigation strategies · Advanced · Must-have
  • Topical control — enforcing scope boundaries on LLM responses · Advanced · Must-have
Evaluation & Production Quality
  • Automated evaluation pipeline design — test set curation, metric selection, regression detection · Advanced · Must-have
  • A/B evaluation methodology for prompt and model changes · Proficient · Must-have
  • Latency profiling for LLM call chains — identifying bottlenecks across multi-step pipelines · Proficient · Must-have
  • Feedback loop design — user signal collection, signal-to-retrieval-weight integration · Proficient · Must-have
  • Production model monitoring — accuracy drift detection, quality degradation alerting · Proficient · Must-have
Development
  • Python — ML/AI application development, async programming · Expert · Must-have
  • API design for AI services — streaming endpoints, error handling, timeout management · Advanced · Must-have
  • Embedding model operations — model selection, batch embedding, index updates · Advanced · Must-have
Nice to Have
  • Adaptive learning systems or personalization engine experience
  • Knowledge graph integration with RAG
  • Multi-agent orchestration patterns 
  • ServiceNow API integration
  • Prior experience building AI products on NVIDIA infrastructure
Experience
  • 11 years of software engineering with at least 2 years focused on LLM application development in production — not research, not demos, not internal tools with 10 users
  • Has shipped an LLM-powered feature or product to production where real users depend on the accuracy and the engineer owns the quality metrics
  • Has owned an AI safety or guardrails implementation for a customer-facing product — not just added an off-the-shelf filter; designed and tested the safety layer
  • Has built RAG evaluation pipelines and used them to make go/no-go release decisions, with accuracy gating as part of the workflow
  • Has profiled and optimized a multi-step LLM call chain for latency
  • Has worked with NVIDIA NIM or NeMo in a real project (strongly 

Salary: $80 - $85




