What are the responsibilities and job description for the Data Engineer position at Capgemini?
We are seeking a highly skilled Data Engineer with expertise in AWS Glue, Snowflake, DBT, and Python to design and develop scalable data pipelines and ETL solutions. The role demands strong proficiency in AWS services such as Glue and Lambda, workflow orchestration with Airflow, and data transformation tools such as DBT and Snowpark. The ideal candidate will collaborate across teams to deliver clean, reliable, and high-performance data systems.
Key Responsibilities:
- Design, develop, and maintain scalable ETL/data pipelines using AWS Glue, Snowflake, Python, and DBT
- Manage data integration workflows using AWS services and third-party tools like Fivetran and HVR
- Implement scalable data models and transformation pipelines using Snowpark and DBT
- Use Airflow for orchestration and Lambda for serverless automation (an illustrative orchestration sketch follows this list)
- Troubleshoot and resolve data-related issues, ensuring pipeline reliability
- Optimize workflows for performance, cost efficiency, and fault tolerance
- Collaborate with stakeholders to gather data requirements and deliver actionable solutions
- Document processes and maintain up-to-date technical specifications
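To give a concrete sense of the orchestration work described above, here is a minimal sketch of an Airflow DAG that starts an AWS Glue job and then runs DBT models. It is illustrative only: the DAG, Glue job, and DBT selector names (daily_orders_pipeline, orders_etl, orders) are hypothetical placeholders, not part of this role's actual stack.

```python
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def start_glue_job():
    """Start a run of a (hypothetical) AWS Glue job and return its run id."""
    glue = boto3.client("glue")
    response = glue.start_job_run(JobName="orders_etl")  # hypothetical job name
    return response["JobRunId"]


with DAG(
    dag_id="daily_orders_pipeline",   # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Extract/load step: kick off the Glue job that lands raw data.
    run_glue = PythonOperator(task_id="run_glue_job", python_callable=start_glue_job)

    # Transform step: run the DBT models over the landed data.
    run_dbt = BashOperator(task_id="dbt_run", bash_command="dbt run --select orders")

    run_glue >> run_dbt
```

A production pipeline would also wait for the Glue run to finish (for example with a sensor or the Amazon provider's Glue operator) before starting the DBT step; the sketch above only shows the basic task dependency.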
Skills Summary:
Core Expertise:
ETL architecture, data pipeline development, cloud-based data integration, Snowflake optimization, serverless computing
Languages & Frameworks:
Python, SQL, DBT, Snowpark
Cloud & Infrastructure:
AWS Glue, Lambda, Airflow, S3, EC2, CloudWatch
Databases & Data Integration:
Snowflake, SQL Server, Fivetran, HVR
DevOps & CI/CD:
Git, Jenkins, automated job scheduling, CI/CD integration
Other Tools & Technologies:
DBT (Data Build Tool), Apache Airflow, AWS Glue Studio, Databricks (optional), Terraform (nice-to-have for IaC)
Soft Skills:
Collaboration, proactive problem-solving, data quality ownership, technical documentation, stakeholder communication
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, or equivalent
- 3 years of hands-on experience with AWS Glue and the AWS data ecosystem
- Experience developing scalable data pipelines with Snowflake and Python (a brief Snowpark sketch follows this list)
- Strong knowledge of SQL, data warehousing, and performance tuning
- Proficiency in DBT for data modeling and transformation
- Familiarity with CI/CD, Airflow scheduling, and serverless tools like Lambda
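As a rough illustration of the Snowflake-and-Python work referenced above, the following is a minimal Snowpark sketch. The connection parameters, the RAW.ORDERS source table, the column names, and the ANALYTICS.CUSTOMER_TOTALS target are all hypothetical; actual pipelines for this role may be structured quite differently.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_

# Hypothetical connection parameters; in practice these would come from a
# secrets manager or environment variables, never hard-coded.
connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "RAW",
}

session = Session.builder.configs(connection_parameters).create()

# Read a raw table, aggregate shipped order amounts per customer, and persist
# the result as an analytics table. Table and column names are placeholders.
orders = session.table("RAW.ORDERS")
customer_totals = (
    orders.filter(col("STATUS") == "SHIPPED")
          .group_by("CUSTOMER_ID")
          .agg(sum_("AMOUNT").alias("TOTAL_SHIPPED_AMOUNT"))
)
customer_totals.write.mode("overwrite").save_as_table("ANALYTICS.CUSTOMER_TOTALS")

session.close()
```

In a DBT-centric setup, the same aggregation would more likely live in a DBT model, with Snowpark reserved for transformations that are awkward to express in SQL.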
The pay range that the employer in good faith reasonably expects to pay for this position is $39.30/hour - $61.40/hour. Our benefits include medical, dental, vision, and retirement benefits. Applications will be accepted on an ongoing basis.

Tundra Technical Solutions is among North America's leading providers of Staffing and Consulting Services. Our success and our clients' success are built on a foundation of service excellence. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.

Unincorporated LA County workers: we reasonably believe that criminal history may have a direct, adverse, and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment:
- Protecting client-provided property, including hardware (both of which may include data), entrusted to you from theft, loss, or damage
- Returning all portable client computer hardware in your possession (including the data contained therein) upon completion of the assignment
- Maintaining the confidentiality of client proprietary, confidential, or non-public information

In addition, job duties require access to secure and protected client information technology systems and related data security obligations.