What are the responsibilities and job description for the GenAI Data Automation Engineer position at Aptonet?
Title: GenAI Data Automation Engineer
Location: Remote (Gaithersburg, MD/DC area)
Rate: $50/hr (W2)
U.S. Citizenship required
Role Summary
We are seeking a GenAI Data Automation Engineer to design and implement AI-driven automation solutions across AWS and Azure hybrid environments. This role supports mission-critical analytics, reporting, and customer engagement platforms through scalable data pipelines, cloud services, enterprise tools, and Generative AI.
Position Type: Remote with quarterly travel to Gaithersburg, MD / DC area
Program: Federal Trade Commission SNS
Key Responsibilities
- Design and maintain data pipelines in AWS using S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB, and Step Functions.
- Develop ETL/ELT processes across DynamoDB → SQL Server (AWS) and AWS ↔ Azure SQL systems.
- Integrate AWS Connect and NICE inContact CRM data into enterprise data pipelines for analytics and reporting.
- Engineer and enhance ingestion pipelines with Apache Spark, Flume, and Kafka for real-time and batch processing into Apache Solr and Amazon OpenSearch.
- Create automated processes for vector generation and embedding from unstructured data to support Generative AI models.
- Automate data quality checks, metadata tagging, and lineage tracking.
- Enhance ingestion/ETL with LLM-assisted transformation and anomaly detection.
- Build conversational BI interfaces that allow natural language access to Solr and SQL data.
- Develop AI-powered copilots for pipeline monitoring and automated troubleshooting.
- Implement SQL Server stored procedures, indexing, query optimization, profiling, and execution plan tuning.
- Apply CI/CD best practices using GitHub, Jenkins, or Azure DevOps.
- Ensure security and compliance through IAM, KMS encryption, VPC isolation, RBAC, and firewalls.
- Support Agile DevOps processes with sprint-based delivery of pipeline and AI-enabled features.
Required Technical Skills
- LLM and Generative AI frameworks, including Amazon Bedrock, Amazon Q, Azure OpenAI, Hugging Face, LangChain, and other open-source platforms.
- SQL, SSIS, Python, Spark, Bash, PowerShell, and the AWS/Azure CLIs.
- AWS services including S3, RDS/SQL Server, Glue, Lambda, EMR, and DynamoDB.
- Apache Flume, Kafka, and Solr.
- REST API integration in data pipelines and workflows.
- JIRA, GitHub, Azure DevOps, and Jenkins.
- SQL Server stored procedures, indexing, query optimization, profiling, and execution plan tuning.
- GenAI Ops pipelines, including model deployment, monitoring, retraining, and lifecycle management for LLMs and AI-enabled data workflows.
- IAM, KMS encryption, VPC isolation, RBAC, and firewalls.
Qualifications & Experience
- BS in Computer Science or related field.
- 2 years of data engineering and automation experience.
- Strong troubleshooting and performance optimization skills in SQL, Spark, or other data engineering solutions.
- Good communication and presentation skills.
- U.S. Citizenship required.
- Ability to obtain Public Trust clearance.
- Candidates may not start prior to clearance completion.
- Standard work hours; Eastern or Central time zone candidates preferred.