What are the key responsibilities and required qualifications for the Lead Data Engineer - AWS position at Vika Talent Solutions?
Key Responsibilities
- Define and own conceptual, logical, and physical data models, along with standards for metadata, lineage, retention, and data quality.
- Design and manage Golden Records (e.g., Customer, Leads) within an MDM framework, including survivorship and enrichment rules.
- Architect and guide identity resolution strategies (deterministic and probabilistic matching, deduplication).
- Design scalable AWS-based data architectures using services such as Glue, Lambda, Step Functions, S3, Athena, Redshift, and EMR/Spark.
- Build and optimize real-time and event-driven pipelines using Kinesis, SNS, and SQS.
- Establish best practices for performance, cost optimization, reliability, and scalability.
- Implement and oversee CI/CD pipelines and Infrastructure as Code (IaC) using CloudFormation or Terraform.
- Develop curated data layers and semantic models to support self-service analytics and BI tools (Tableau, Power BI).
- Establish enterprise data quality frameworks, including validation, profiling, monitoring, and SLAs.
- Collaborate with cross-functional teams to ensure data governance, privacy, and security compliance.
- Translate business requirements into scalable architecture and technical solutions.
- Lead troubleshooting and optimization of production data systems; promote SRE and observability practices.
- Document architecture standards and mentor engineers on data architecture best practices.
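As a concrete illustration of the identity-resolution responsibility above, here is a minimal Python sketch of deterministic matching with a simple survivorship rule. The field names (`email`, `name`, `phone`), the email-based match key, and the "most complete record wins" heuristic are all illustrative assumptions, not details from the posting.

```python
# Hypothetical sketch: deterministic identity resolution keyed on a
# normalized email, with a "most complete record wins" survivorship rule.

def match_key(record):
    """Deterministic match key: lowercased, whitespace-trimmed email."""
    return record.get("email", "").strip().lower()

def completeness(record):
    """Survivorship score: number of non-empty fields."""
    return sum(1 for value in record.values() if value)

def resolve(records):
    """Group records by match key; keep the most complete record per key."""
    golden = {}
    for rec in records:
        key = match_key(rec)
        if not key:
            continue  # no deterministic key -> would fall through to probabilistic matching
        if key not in golden or completeness(rec) > completeness(golden[key]):
            golden[key] = rec
    return list(golden.values())

leads = [
    {"email": "Ana@Example.com", "name": "Ana", "phone": ""},
    {"email": "ana@example.com ", "name": "Ana Diaz", "phone": "555-0100"},
    {"email": "bo@example.com", "name": "Bo", "phone": ""},
]
print(resolve(leads))  # two golden records; the richer Ana record survives
```

In practice the survivorship rules would be configurable per attribute (e.g. trust the CRM for phone, the web form for email), and records without a deterministic key would be routed to a probabilistic matcher rather than skipped.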
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, Data Science, or a related field (or equivalent experience).
- Strong experience in enterprise data architecture, including MDM, CDP, identity resolution, and data quality frameworks.
- Hands-on expertise with AWS data services (Glue, Lambda, Step Functions, S3, Athena, Redshift, EMR/Spark).
- Advanced SQL skills (PostgreSQL preferred).
- Proficiency in Python, Spark, or C# for data engineering.
- Experience designing and implementing ETL/ELT pipelines and distributed data systems.
- Experience supporting BI and analytics platforms such as Tableau or Power BI.
- Familiarity with data governance tools (Glue Data Catalog, Collibra, Alation).
- Experience with CI/CD and IaC tools (CloudFormation, Terraform).
- Strong collaboration and communication skills across technical and business teams.