Greetings from Smart Work IT Services
Role: Data Engineer with Gen AI
Location: New York City, NY (Full-Time)
We are looking for a Data Engineer with strong Generative AI exposure to design, build, and maintain the scalable data pipelines and data platforms that power AI/ML and GenAI applications. The ideal candidate has solid experience with modern data engineering tools, hands-on Python development skills, and experience building APIs.
Key Responsibilities
- Design, build, and maintain scalable data pipelines using modern data engineering tools.
- Develop and manage data transformation workflows using dbt and Dagster.
- Build robust data models and pipelines supporting analytics and AI workloads.
- Develop backend services and APIs using Python and FastAPI to enable data access for AI applications.
- Work with PostgreSQL and SQL to design efficient schemas and optimize queries.
- Process and analyze data using Pandas and Python-based data frameworks.
- Integrate structured and unstructured data pipelines to support Generative AI applications.
- Build AI agents using LangGraph.
- Implement RAG pipelines and Text-to-SQL systems.
- Integrate AI capabilities into enterprise platforms.
- Collaborate with AI/ML engineers to enable LLM-powered applications and data pipelines.
- Implement data quality, monitoring, and performance optimization practices.
- Participate in architecture design discussions and contribute to scalable data platform design.
Required Skills
- Strong programming experience in Python.
- Experience working with Dagster for orchestration.
- Hands-on experience with dbt for data transformation and modeling.
- Advanced SQL skills and experience with PostgreSQL.
- Experience working with Pandas for data processing.
- Experience building APIs using FastAPI.
- Solid understanding of data pipeline design and ETL/ELT processes.
- Familiarity with building data infrastructure for AI/ML or GenAI use cases.
Good to Have
- Experience designing Star Schema / Dimensional Data Models.
- Familiarity with Medallion Architecture (Bronze, Silver, Gold layers).
- Experience with Generative AI frameworks or LLM integrations.
- Knowledge of RAG pipelines, vector databases, or embedding workflows.
- Exposure to cloud platforms such as AWS / Google Cloud Platform / Azure.
Vetting Process
- 1-hour technical discussion (live coding in Python)
- 30-minute technical discussion
- 30-minute Delivery Connect
- In-person interview with the customer in New York City, NY