What are the responsibilities and job description for the Machine Learning Engineer (Computer Vision/Multimodal/Generative AI) position at SPREEAI?
About The Role
We are hiring Machine Learning Engineers who want to work on frontier problems in vision and generative AI where standard solutions break. You will work across photorealistic virtual try-on, video-based modeling, Smart Sizing, and multimodal representation learning. The work spans modern architectures such as diffusion models, transformers, and learned visual representations, with emphasis on controllability, compute efficiency, and production readiness. This role sits at the intersection of applied research and engineering execution.
What You'll Do
- Develop and improve multimodal AI systems involving image, video, and generative pipelines.
- Work on diffusion model optimization, controllability, and step efficiency.
- Design experiments and evaluation frameworks for visual realism and consistency.
- Translate research prototypes into scalable production systems.
- Collaborate closely with infrastructure teams to optimize training and inference.
Qualifications
- Degree in Computer Science, AI, Robotics, or a comparable combination of education and practical experience.
- Strong programming skills in Python and familiarity with object-oriented languages (C++, Java, or similar).
- Strong data structures and algorithms fundamentals.
- Experience with PyTorch or similar frameworks.
- Familiarity with CNNs, Vision Transformers (ViT), or diffusion architectures.
- Experience with Stable Diffusion, ControlNet, LoRA, or generative pipelines.
- Experience with human pose estimation, geometry-aware modeling, or video understanding.
- Experience shipping ML systems into production.