What are the responsibilities and job description for the AI Engineer position at Relanto, Inc.?
Role: AI Engineer
Duration: 6 months
Location: Bay Area, CA (hybrid)
Technology: Experience with Google Cloud Platform (Gemini, Vertex AI)
Responsibilities:
- Design and implement end-to-end ML pipelines on Google Cloud Platform (GCP)
- Build, fine-tune, and optimize AI/ML models for production deployment using Vertex AI and Gemini models
- Develop Generative AI solutions leveraging Gemini APIs, prompt engineering, Retrieval-Augmented Generation (RAG), and multimodal AI capabilities
- Develop and maintain RESTful APIs for ML and GenAI model serving using FastAPI/Flask
- Implement vector search and semantic retrieval capabilities using BigQuery ML, Vertex AI Matching Engine, and embeddings
- Create automated testing, validation, CI/CD, and deployment pipelines for ML/AI workflows
- Set up model monitoring, observability, drift detection, and performance tracking for production AI systems
- Optimize model inference, scalability, latency, and serving infrastructure on Google Cloud Platform
- Collaborate with data engineering, product, and business teams to deliver scalable AI-driven applications
- Work with containerized deployments using Docker and Google Kubernetes Engine (GKE)
Required Skills:
- Minimum of 4 years of relevant experience
- Strong Python programming skills, including experience with ML frameworks such as PyTorch and TensorFlow
- Hands-on experience with Large Language Models (LLMs), Generative AI, and prompt engineering
- Experience working with Google Gemini models and Gemini APIs for GenAI use cases
- Strong proficiency in Google Cloud Platform AI services including Vertex AI, Gemini, Cloud ML Engine, BigQuery ML, and Vertex AI Matching Engine
- Experience implementing vector databases/search, embeddings, semantic search, and RAG architectures
- Expertise in RESTful API development using FastAPI or Flask
- Experience with Docker, Kubernetes, and Google Kubernetes Engine (GKE)
- Strong understanding of CI/CD pipelines and MLOps workflows for ML/AI deployments
- Experience with model monitoring, performance optimization, and scalable AI serving architecture