What are the responsibilities and job description for the Machine Learning Engineer position at YASH Technologies?
Role: Machine Learning Engineer
Location: St. Louis, MO
Type: Contract
This position is ideal for someone eager to apply practical ML and LLM techniques in production, leveraging Databricks, Python, and modern vector database frameworks.
What You’ll Do
- Develop Retrieval & Embedding Pipelines:
- Build and deploy pipelines that transform enterprise documents (Confluence pages, OneDrive files, internal reports) into structured and vectorized data for semantic retrieval.
- Use tools like Databricks MLflow, MosaicML, and LangChain to orchestrate workflows.
- Integrate LLMs with Knowledge Bases:
- Design and implement Retrieval-Augmented Generation (RAG) systems to ground LLM outputs in enterprise data.
- Collaborate with AI agents on Databricks to provide contextualized responses from internal knowledge stores.
- Experiment & Optimize Models:
- Evaluate different embedding models, fine-tuning strategies, and retrieval mechanisms for efficiency, scalability, and accuracy.
- Contribute to prompt engineering, model benchmarking, and performance tracking.
- Collaborate Across Disciplines:
- Work closely with Data Engineers on ingestion and cleaning pipelines, and with Software Engineers on API integration and front-end consumption of ML services.
- Operationalize ML Solutions:
- Use MLflow to track experiments, automate deployment pipelines, and ensure reproducibility across environments.
- Contribute to testing, documentation, and continuous improvement of ML infrastructure.
What You Bring
- Solid foundation in machine learning, natural language processing, or applied AI.
- Proficiency in Python and familiarity with frameworks such as PyTorch, TensorFlow, or Hugging Face Transformers.
- Experience with Databricks, MLflow, or MosaicML.
- Familiarity with LangChain, LlamaIndex, or similar RAG frameworks.
- Understanding of vector databases (e.g., Chroma, Milvus, Pinecone, FAISS).
- Experience with API integration and data retrieval from enterprise systems (e.g., Confluence, SharePoint, OneDrive).
- Ability to collaborate in a cross-functional engineering team and communicate complex technical concepts clearly.
Bonus Skills
- Experience fine-tuning or evaluating LLMs (e.g., Llama, MPT, Falcon, or Databricks-hosted models).
- Knowledge of OCR pipelines for document ingestion and Databricks Unity Catalog for managing structured data.
- Background in cloud infrastructure, containerization (Docker), or CI/CD for ML systems.
- Prior work with embedding search optimization, semantic caching, or enterprise AI governance.