What are the responsibilities and job description for the LLM Application Development Engineer position at MASH Pro Tech?

AI Application Engineer - LLM Application Development

Santa Clara - 5 days Onsite

NVIDIA-Specific Stack

NVIDIA NIM — deployment, inference, API integration, model lifecycle · Advanced · Must-have
NeMo framework — model configuration, inference optimization · Proficient · Must-have
NeMo Guardrails — rails configuration, content safety models, jailbreak detection, topical control · Advanced · Must-have
NVIDIA Riva — ASR / TTS integration in application layer · Proficient · Must-have
NVIDIA NIM models — llama-3.1-nemoguard-8b, llama-3.2-nv-rerankqa-1b-v2, llama-3.2-nv-embedqa-1b-v2, nemotron-3-super-120b-a12b familiarity · Proficient · Nice to have

Duties:

Works on the intelligence layer for multiple programs and owns all model quality, RAG accuracy, prompt engineering, and AI safety across applications
Socratic tutor persona, adaptive learning recommendation engine, multi-modal AI (text and voice via NVIDIA Riva), RAG evaluation framework, and feedback loop into retrieval
6-LLM call chain orchestration (NeMo Guardrails → intent classification → query rewriting → RAG → synthesis), Perplexity web search integration, NIM recommendation engine, and compatibility check logic
Production-grade AI quality from launch — this is not a research or prototyping role; accuracy thresholds, latency requirements, and safety guardrails must pass InfoSec adversarial testing before Release 1

Required Skills

LLM Application Development

LLM prompt engineering — system prompts, few-shot examples, chain-of-thought, instruction following · Expert · Must-have
Multi-step LLM chain orchestration — LangChain, LlamaIndex, or custom orchestration · Expert · Must-have
Multi-turn conversation design — context window management, conversation summarization, session memory · Advanced · Must-have
Streaming LLM response handling — token-by-token streaming, partial response rendering · Advanced · Must-have
Model selection and benchmarking — matching model size to task; balancing latency, cost, and accuracy · Advanced · Must-have

RAG Pipeline Design & Quality

RAG pipeline design — chunking strategy, embedding model selection, retrieval configuration · Expert · Must-have
Vector similarity search tuning — index parameters, similarity thresholds, retrieval depth · Advanced · Must-have
Reranking — cross-encoder rerankers, relevance scoring · Advanced · Must-have
RAG evaluation frameworks — RAGAS, TruLens, or equivalent; automated eval pipelines · Advanced · Must-have
Hybrid search — combining dense vector retrieval with BM25 or keyword search · Proficient · Nice to have

AI Safety & Guardrails

Prompt injection detection and mitigation · Advanced · Must-have
Jailbreak testing and red-teaming LLM systems · Advanced · Must-have
Content safety classifier integration · Advanced · Must-have
Hallucination detection and mitigation strategies · Advanced · Must-have
Topical control — enforcing scope boundaries on LLM responses · Advanced · Must-have

Evaluation & Production Quality

Automated evaluation pipeline design — test set curation, metric selection, regression detection · Advanced · Must-have
A/B evaluation methodology for prompt and model changes · Proficient · Must-have
Latency profiling for LLM call chains — identifying bottlenecks across multi-step pipelines · Proficient · Must-have
Feedback loop design — user signal collection, signal-to-retrieval-weight integration · Proficient · Must-have
Production model monitoring — accuracy drift detection, quality degradation alerting · Proficient · Must-have

Development

Python — ML/AI application development, async programming · Expert · Must-have
API design for AI services — streaming endpoints, error handling, timeout management · Advanced · Must-have
Embedding model operations — model selection, batch embedding, index updates · Advanced · Must-have

Nice to Have

Experience

11 years of software engineering with at least 2 years focused on LLM application development in production — not research, not demos, not internal tools with 10 users
Has shipped an LLM-powered feature or product to production where real users depend on the accuracy and the engineer owns the quality metrics
Has owned an AI safety or guardrails implementation for a customer-facing product — not just added an off-the-shelf filter; designed and tested the safety layer
Has built RAG evaluation pipelines and used them to make go/no-go release decisions — accuracy gating is part of the workflow
Has profiled and optimized a multi-step LLM call chain for latency
Has worked with NVIDIA NIM or NeMo in a real project (strongly

Salary: $80 - $85
