What are the responsibilities and job description for the ML Research Engineer, ML Systems, Mid-Level position at Jobright.ai?
Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.
Job Summary:
Scale AI is a leading provider of training and evaluation data for the machine learning lifecycle. They are seeking a Machine Learning Research Engineer to build and optimize their internal distributed framework for large language model training and inference, collaborating with various ML teams and researchers.
Responsibilities:
• Build, profile and optimize our training and inference framework
• Collaborate with ML teams to accelerate their research and development, enabling them to build the next generation of models and data curation methods
• Research and integrate state-of-the-art technologies to optimize our ML system
Qualifications:
Required:
• Strong excitement about system optimization
• Experience with multi-node LLM training and inference
• Experience with developing large-scale distributed ML systems
• Strong software engineering skills, with proficiency in frameworks and tools such as CUDA, PyTorch, Transformers, and FlashAttention
• Strong written and verbal communication skills and the ability to operate in a cross-functional team environment
Preferred:
• Demonstrated expertise in post-training methods and/or next-generation use cases for large language models, including instruction tuning, RLHF, tool use, reasoning, agents, and multimodality
Company:
Scale AI provides a data-oriented platform that assists in the development of AI applications. Founded in 2016, the company is headquartered in San Francisco, California, USA, and has 501-1000 employees. It is currently a late-stage company. Scale AI has a track record of offering H1B sponsorships.