
Staff Machine Learning Engineer / Principal ML Engineer

SRS Consulting Inc
San Jose, CA Contractor
POSTED ON 12/8/2025 CLOSED ON 12/23/2025

What are the responsibilities and job description for the Staff Machine Learning Engineer / Principal ML Engineer position at SRS Consulting Inc?

Role: Staff Machine Learning Engineer

Location: San Jose, CA (Onsite; local candidates)

Duration: Long-term


Mode of Interview: Virtual, with a final in-person round


Why this role exists

We're building privacy‐preserving LLM capabilities that help hardware design teams reason over Verilog/SystemVerilog and RTL artifacts—code generation, refactoring, lint explanation, constraint translation, and spec‐to‐RTL assistance. We're looking for a Staff‐level engineer to technically lead a small, high‐leverage team that fine‐tunes and productizes LLMs for these workflows in a strict enterprise data‐privacy environment.

You don't need to be a Verilog/RTL expert to start; curiosity, drive, and deep LLM craftsmanship matter most. Any HDL/EDA fluency is a strong plus.


What you'll do (Responsibilities)

• Own the technical roadmap for Verilog/RTL‐focused LLM capabilities—from model selection and adaptation to evaluation, deployment, and continuous improvement.

• Lead a hands‐on team of applied scientists/engineers: set direction, unblock technically, review designs/code, and raise the bar on experimentation velocity and reliability.

• Fine‐tune and customize models using state‐of‐the‐art techniques (LoRA/QLoRA, PEFT, instruction tuning, preference optimization/RLAIF) with robust HDL‐specific evals:

o Compile‐/lint‐/simulate‐based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and "does‐it‐synthesize" checks.

• Design privacy‐first ML pipelines on AWS:

o Training/customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS KServe/Triton/DJL) for bespoke training needs.

o Artifacts in S3 with KMS CMKs; isolated VPC subnets & PrivateLink (including Bedrock VPC endpoints), IAM least‐privilege, CloudTrail auditing, and Secrets Manager for credentials.

o Enforce encryption in transit/at rest, data minimization, no public egress for customer/RTL corpora.

• Stand up dependable model serving: Bedrock model invocation where it fits, and/or low‐latency self‐hosted inference (vLLM/TensorRT‐LLM), autoscaling, and canary/blue‐green rollouts.

• Build an evaluation culture: automatic regression suites that run HDL compilers/simulators, measure behavioral fidelity, and detect hallucinations/constraint violations; model cards and experiment tracking (MLflow/Weights & Biases).

• Partner deeply with hardware design, CAD/EDA, Security, and Legal to source/prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.

• Drive productization: integrate LLMs with internal developer tools (IDEs/plug‐ins, code review bots, CI), retrieval (RAG) over internal HDL repos/specs, and safe tool‐use/function‐calling.

• Mentor & uplevel: coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure‐by‐default systems.
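
For candidates unfamiliar with the pass@k metric named in the evaluation bullets above, here is a minimal sketch of the commonly used unbiased estimator: generate n samples per task, count the c that pass, and compute 1 − C(n−c, k)/C(n, k). The function names and the idea that "pass" means "compiles and simulates cleanly" are illustrative assumptions, not a description of the team's actual harness.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n samples, c of which passed.

    Uses the standard unbiased estimator 1 - C(n-c, k) / C(n, k).
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; each item is (n_samples, n_passed)."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("no tasks supplied")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical counts: (samples generated, samples that passed).
    tasks = [(10, 3), (10, 0), (10, 10)]
    print(mean_pass_at_k(tasks, k=1))
```

In an HDL setting the pass/fail signal would presumably come from a compiler, linter, or simulator run rather than unit tests, but the estimator itself is unchanged.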


What you'll bring (Minimum qualifications)

• 10+ years total engineering experience with 5+ years in ML/AI or large‐scale distributed systems; 3+ years working directly with transformers/LLMs.

• Proven track record shipping LLM‐powered features in production and leading ambiguous, cross‐functional initiatives at Staff level.

• Deep hands‐on skill with PyTorch, Hugging Face Transformers/PEFT/TRL, distributed training (DeepSpeed/FSDP), parameter‐efficient and quantized fine‐tuning (LoRA/QLoRA), and constrained/grammar‐guided decoding.

• AWS expertise to design and defend secure enterprise deployments, including:

o Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)

o SageMaker (Training, Inference, Pipelines), S3, EC2/EKS/ECR, VPC/Subnets/Security Groups, IAM, KMS, PrivateLink, CloudWatch/CloudTrail, Step Functions, Batch, Secrets Manager.

• Strong software engineering fundamentals: testing, CI/CD, observability, performance tuning; Python a must (bonus for Go/Java/C++).

• Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.


Nice to have (Preferred qualifications)

• Familiarity with Verilog/SystemVerilog/RTL workflows: lint, synthesis, timing closure, simulation, formal, test benches, and EDA tools (Synopsys/Cadence/Mentor).

• Experience integrating static analysis/AST‐aware tokenization for code models or grammar‐constrained decoding.

• RAG at scale over code/specs (vector stores, chunking strategies), tool‐use/function‐calling for code transformation.

• Inference optimization: TensorRT‐LLM, KV‐cache optimization, speculative decoding; throughput/latency trade‐offs at batch and token levels.

• Model governance/safety in the enterprise: model cards, red‐teaming, secure eval data handling; exposure to SOC2/ISO 27001/NIST frameworks.

• Data anonymization, DLP scanning, and code de‐identification to protect IP.


What success looks like

90 days

• Baseline an HDL‐aware eval harness that compiles/simulates; establish secure AWS training & serving environments (VPC‐only, KMS‐backed, no public egress).

• Ship an initial fine‐tuned/customized model with measurable gains vs. base (e.g., +X% compile‐pass rate, −Y% lint findings per KLOC generated).

180 days

• Expand customization/training coverage (Bedrock for managed FMs including Anthropic; SageMaker/EKS for bespoke/open models).

• Add constrained decoding and retrieval over internal design specs; productionize inference with SLOs (p95 latency, availability) and audited rollout to pilot hardware teams.

12 months

• Demonstrably reduce review/iteration cycles for RTL tasks with clear metrics (defect reduction, time‐to‐lint‐clean, % auto‐fix suggestions accepted), and a stable MLOps path for continuous improvement.
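
As a rough illustration of how the 90-day gains above might be reported, the sketch below compares a base and a tuned model on compile-pass rate (in percentage points) and lint findings per thousand lines of generated code (as a relative change). The `EvalRun` record and its field names are hypothetical; the real harness may track different quantities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRun:
    """Aggregate counts from one evaluation run (hypothetical schema)."""
    samples: int        # generated HDL snippets evaluated
    compiled: int       # snippets that compiled successfully
    lint_findings: int  # total lint findings across all snippets
    loc_generated: int  # total lines of code generated

    def compile_pass_rate(self) -> float:
        if self.samples <= 0:
            raise ValueError("samples must be positive")
        return self.compiled / self.samples

    def lint_per_kloc(self) -> float:
        if self.loc_generated <= 0:
            raise ValueError("loc_generated must be positive")
        return self.lint_findings / (self.loc_generated / 1000)


def compare(base: EvalRun, tuned: EvalRun) -> dict:
    """Return the compile-pass gain (points) and lint/KLOC change (%)."""
    gain_points = (tuned.compile_pass_rate() - base.compile_pass_rate()) * 100
    base_lint = base.lint_per_kloc()
    if base_lint == 0:
        raise ValueError("base run has no lint findings to compare against")
    lint_change_pct = (tuned.lint_per_kloc() - base_lint) / base_lint * 100
    return {"compile_pass_gain_points": gain_points,
            "lint_per_kloc_change_pct": lint_change_pct}


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    base = EvalRun(samples=200, compiled=100, lint_findings=60, loc_generated=12000)
    tuned = EvalRun(samples=200, compiled=150, lint_findings=30, loc_generated=12000)
    print(compare(base, tuned))
```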


How we work (Security & privacy by design)

• Customer and internal design data remain within private AWS VPCs; access via IAM roles and audited by CloudTrail; all artifacts encrypted with KMS.

• No public internet calls for sensitive workloads; Bedrock access via VPC interface endpoints/PrivateLink with endpoint policies; SageMaker and/or EKS run in private subnets.

• Data pipelines enforce minimization, tagging, retention windows, and reproducibility; DLP scanning and redaction are first‐class steps.

• We produce model cards, data lineage, and evaluation artifacts for every release.
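
To make "redaction as a first-class step" concrete, here is a deliberately small sketch of a redaction pass over HDL source text before it enters a training corpus. The two patterns (email addresses and strings shaped like AWS access key IDs) and the placeholder format are illustrative assumptions; a production pipeline would presumably rely on a dedicated DLP scanner with a much broader rule set.

```python
import re
from typing import Dict, Tuple

# Illustrative patterns only; not an exhaustive or authoritative DLP rule set.
PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace each match with a <LABEL> placeholder and count the hits.

    The counts can be logged as part of the data lineage for the corpus.
    """
    counts: Dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, hits = pattern.subn(f"<{label}>", text)
        counts[label] = hits
    return text, counts


if __name__ == "__main__":
    sample = (
        "// author: jane.doe@example.com\n"
        "// key = AKIAABCDEFGHIJKLMNOP\n"
        "module top; endmodule\n"
    )
    cleaned, stats = redact(sample)
    print(cleaned)
    print(stats)
```

Returning the hit counts alongside the cleaned text keeps the step auditable without storing the sensitive values themselves.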


Tech you'll touch

• Modeling: PyTorch, HF Transformers/PEFT/TRL, DeepSpeed/FSDP, vLLM, TensorRT‐LLM

• AWS & MLOps: Amazon Bedrock (Anthropic and other FMs, Guardrails, Knowledge Bases, Runtime APIs), SageMaker (Training/Inference/Pipelines), MLflow/W&B, ECR, EKS/KServe/Triton, Step Functions

• Platform/Security: S3 with KMS, IAM, VPC/PrivateLink (incl. Bedrock), CloudWatch/CloudTrail, Secrets Manager

• Tooling (nice to have): HDL toolchains for compile/simulate/lint, vector stores (pgvector/OpenSearch), GitHub/GitLab CI
