Data Engineer III 70756-1

Jobs via Dice
Menlo Park, CA Full Time
POSTED ON 6/3/2026
AVAILABLE BEFORE 7/2/2026
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Akidev Corporation, is seeking the following. Apply via Dice today!

Request ID: 70756-1

Start/End Dates: 7/13/2026 - 12/31/2026

Tax Work Location: US - CA - Menlo Park (105201)

Job Title: Data Analytics & Engineering - Data Engineer III

Job Description: Summary

Generative AI models are only as good as the data they consume. Unlike traditional data engineering, building data pipelines for generative AI requires orchestrating ML model invocations (content understanding classifiers, embedding models, LLM-based cleaners) alongside standard SQL-based transformations, all at billion-row scale.

This role sits at the intersection of Data Engineering and ML Systems. The Senior AI Data Engineer will own end-to-end data pipelines that don't just move and transform data, but enrich it through remote model inference, managing the systems complexity of async execution, capacity allocation, retry/fallback logic, and throughput optimization that comes with it. This is not a pure ETL-with-SQL role; it demands hands-on systems experience with distributed inference infrastructure.

Our team develops comprehensive data curation and evaluation solutions for image generation models across quality dimensions including visual quality, prompt adherence, identity preservation, naturalness, and visual text generation.

Job Responsibilities

AI-Augmented Data Pipelines: Design and maintain AI-augmented, large-scale data pipelines (billions of images) integrating traditional transformations with ML models (classifiers, embeddings, LLMs) for cleaning and annotation.

Remote Inference Orchestration: Own the systems for remote ML model inference orchestration within pipelines, managing batching, retries, async jobs, and ensuring graceful degradation.

Feature Pipelines: Build and maintain scalable pipelines for generating, storing, and serving vector embeddings, including nearest-neighbor index management and quality validation.

Data Curation at Scale: Source, filter, and curate training datasets using a combination of SQL and model-derived signals (e.g., aesthetic scores, NSFW classifiers), owning the end-to-end data flow and maintaining governance, quality, and compliance.

Additional Responsibilities

LLM-Assisted Annotation: Design and operate pipelines that use LLMs and vision models for automated annotation of training data, including auditing workflows to measure and improve annotation model performance.

Tooling & Frameworks: Contribute to shared tooling and frameworks that make it easier for the broader team to build AI-augmented data pipelines — e.g., reusable operators for model invocation, standard patterns for async job management.

Skills Required

Advanced SQL & data pipeline expertise: complex queries, query optimization, and pipeline orchestration frameworks (Airflow, Dataswarm, or equivalent).

Experience integrating ML models into data pipelines: calling inference endpoints, managing model versions, batching requests, and handling inference failures at scale.

Proficiency with AI-assisted coding agents (e.g., Copilot, Cursor, Codex). Expected to leverage AI tools as a force multiplier for writing, debugging, and reviewing code, building pipelines faster, and accelerating day-to-day engineering workflows.

Strong verbal and written communication skills, problem-solving ability, and cross-functional collaboration.

Preferred

Working knowledge of embeddings and vector representations: generating, storing, indexing, and querying embeddings (FAISS, Milvus, or equivalent).

Familiarity with content-understanding models such as image classifiers, object detection, OCR, NSFW detection, and aesthetic scoring.

Experience with LLMs for data tasks such as prompt engineering for annotation, data cleaning, or evaluation using LLM APIs.

Knowledge of generative AI, including diffusion models, image generation, and evaluation metrics (FID, CLIP score, etc.).

Education / Experience

Bachelor's degree or higher in Computer Science, Data Engineering, Machine Learning, or a related STEM field.

5 years of industry experience in data engineering, ML engineering, or a hybrid role involving both data pipelines and model serving/inference.

Demonstrated track record of building and operating production data pipelines that invoke ML models at scale.

Salary.com Estimation for Data Engineer III 70756-1 in Menlo Park, CA
$153,753 to $192,813


