What are the responsibilities and job description for the Machine Learning Engineer position at Scale.jobs?
About The Role
The role focuses on the end-to-end development and scaling of machine learning systems that power core product features. This involves moving beyond research notebooks to build resilient, low-latency infrastructure that serves high-volume inference requests in real time.
The team operates at the intersection of data science and systems engineering, ensuring that models remain performant, reproducible, and observable. This position is critical for bridging the gap between experimental model training and enterprise-grade deployment.
Key Responsibilities
- Design and implement production-grade ML pipelines using Python, PyTorch, and Kubernetes to support automated training and inference
- Develop and maintain feature stores and data processing workflows using Spark or Flink to ensure consistency between training and serving environments
- Optimize model latency and throughput through techniques such as quantization, pruning, and efficient GPU resource allocation
- Build robust MLOps monitoring systems to detect feature drift and model performance degradation using Prometheus, Grafana, or specialized ML observability tools
- Architect scalable vector search and retrieval systems to support advanced RAG (Retrieval-Augmented Generation) and recommendation engine workloads
- Collaborate with backend engineers to integrate ML services into microservices architectures via gRPC or FastAPI
- Implement automated testing and CI/CD pipelines specifically tailored for ML artifacts and model weights
Qualifications
- 3–6 years of experience as a Machine Learning Engineer or Software Engineer with a heavy focus on production ML systems
- Expert-level proficiency in Python and extensive experience with deep learning frameworks such as PyTorch or JAX
- Hands-on experience with containerization (Docker) and orchestration (Kubernetes) in a cloud environment (AWS, GCP, or Azure)
- Strong understanding of distributed systems, data structures, and the mathematical foundations of machine learning
- Proven track record of deploying and monitoring models that handle significant traffic in a production setting
- Bachelor’s or Master’s degree in Computer Science, Mathematics, or a related quantitative field
- Bonus: Experience with NVIDIA Triton Inference Server, Ray, or contributing to open-source ML libraries