What are the responsibilities and job description for the Data Engineer with AWS Experience (Onsite, Full-Time) position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Visionary Innovative Technology Solutions, is seeking the following. Apply via Dice today!
Job Title: AWS Data Engineer (Python, PySpark)
Job Type: Full-Time
Role & Responsibilities
- Design and Development: Design, develop, and maintain robust, scalable AWS-based data pipelines for batch and real-time processing.
- ETL Implementation: Implement efficient Extract, Transform, Load (ETL) and ELT processes, ensuring data integrity, quality, and optimal performance using AWS services.
- Big Data Processing: Utilize PySpark and Apache Spark to process large-scale datasets within the AWS ecosystem (e.g., AWS Glue, EMR); a brief PySpark sketch follows this list.
- AWS Services Management: Work extensively with AWS cloud data services, including Amazon S3 (for data lake storage), AWS Glue, EMR, Redshift (data warehousing), Lambda, Athena, and Step Functions.
- Collaboration: Collaborate with cross-functional teams, including data scientists, analysts, and application developers, to understand data requirements and integrate data engineering solutions into broader business applications.
- Optimization and Troubleshooting: Monitor, troubleshoot, and optimize existing data workflows and infrastructure to ensure high availability, scalability, security, and cost-effectiveness.
- Best Practices: Apply software engineering best practices, including version control (Git), documentation, CI/CD pipelines, and infrastructure-as-code (e.g., CloudFormation, Terraform).
- Database Management: Work with both SQL and NoSQL databases, writing complex queries and performing query optimization.
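To make the PySpark and ETL responsibilities above more concrete, here is a minimal sketch of the kind of Glue/EMR job this role might own. It is illustrative only: the bucket paths, column names, and aggregation are placeholder assumptions, not details from this posting.

```python
# Minimal sketch of a PySpark ETL step as it might run on AWS Glue.
# Bucket paths and column names are illustrative placeholders.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read raw JSON events from the S3 data lake (placeholder path).
raw = spark.read.json("s3://example-data-lake/raw/events/")

# Transform: basic cleansing plus a daily aggregate by event type.
daily = (
    raw.filter(F.col("event_type").isNotNull())
       .withColumn("event_date", F.to_date("event_timestamp"))
       .groupBy("event_date", "event_type")
       .agg(F.count("*").alias("event_count"))
)

# Load: write partitioned Parquet back to the curated zone of the lake.
(daily.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("s3://example-data-lake/curated/daily_event_counts/"))

job.commit()
```

The same transform logic would run largely unchanged on EMR; only the Glue job bootstrapping (the `GlueContext`/`Job` wiring) is Glue-specific.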
Required Skills & Qualifications
- Experience: Proven hands-on experience in AWS data engineering, with expertise in Python and PySpark.
- Programming: Strong proficiency in Python, PySpark, and SQL.
- AWS Expertise: In-depth experience with the AWS data stack: S3, Glue, EMR, Redshift, Lambda, and Athena.
- Data Fundamentals: Deep understanding of data modeling, data warehousing concepts, data lake architecture, and performance optimization techniques.
- Tools: Familiarity with workflow management/orchestration tools such as Apache Airflow; see the orchestration sketch after this list.
- Problem Solving: Strong analytical and problem-solving skills.
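As an illustration of the orchestration familiarity listed above, the following is a minimal Apache Airflow DAG sketch showing how a pipeline like the one described might be scheduled. The DAG id, schedule, bucket key, Glue job name, and region are placeholder assumptions, not requirements from this posting.

```python
# Minimal Airflow DAG sketch: wait for a raw S3 drop, then trigger a Glue ETL job.
# All identifiers (DAG id, bucket, job name, region) are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor

with DAG(
    dag_id="daily_events_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Wait for the day's raw data to land in the data lake (placeholder key).
    wait_for_raw = S3KeySensor(
        task_id="wait_for_raw_events",
        bucket_name="example-data-lake",
        bucket_key="raw/events/{{ ds }}/_SUCCESS",
    )

    # Run the PySpark ETL job on AWS Glue once the data is available.
    run_etl = GlueJobOperator(
        task_id="run_daily_etl",
        job_name="daily-events-etl",
        region_name="us-east-1",
    )

    wait_for_raw >> run_etl
```

A similar structure could instead drive AWS Step Functions, which the responsibilities above also mention; Airflow is shown here only because the posting names it explicitly.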