What are the responsibilities and job description for the AWS Data Engineer position at First Soft Solutions LLC?
We are actively hiring for an AWS Data Engineer.
The core of the work centers on data. The project's goal is to "collect, store, and expose data across the Enterprise" using a "Data Lake House solution." The required technologies—S3, Glue, Lake Formation, PySpark—are the fundamental building blocks of a modern data platform on AWS. Candidates also need solid DevOps experience.
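As a rough illustration of the transform stage of such a pipeline, a sketch follows in plain Python; in a real Glue job the same logic would typically be expressed against a PySpark DataFrame, and all record and field names here are hypothetical:

```python
# Hypothetical cleanup step for ingested order records, illustrating the
# "transform" in ETL: normalize types, trim whitespace, drop bad rows.
from datetime import datetime
from typing import Optional

def transform_order(raw: dict) -> Optional[dict]:
    """Normalize one raw order record; return None to drop malformed rows."""
    try:
        return {
            "order_id": str(raw["order_id"]).strip(),
            "amount_usd": round(float(raw["amount"]), 2),
            "order_date": datetime.strptime(raw["date"], "%Y-%m-%d").date().isoformat(),
        }
    except (KeyError, ValueError):
        return None  # malformed records are filtered out downstream

records = [
    {"order_id": " 1001 ", "amount": "19.999", "date": "2024-05-01"},
    {"order_id": "1002", "amount": "not-a-number", "date": "2024-05-02"},
]
clean = [r for r in (transform_order(x) for x in records) if r]
```

In a Glue PySpark job this per-record logic would usually become DataFrame column expressions or a UDF, but the shape of the work (cast, validate, filter) is the same.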
A successful candidate for this role would need to be an expert in:
- Data Architecture: Designing and implementing a scalable data lakehouse.
- ETL/ELT: Developing data pipelines to ingest, transform, and load data from various sources. Glue Jobs and PySpark are the primary tools for this.
- Data Governance: Using AWS Lake Formation to secure and manage access to the data, ensuring the "right data to the right customer at the right time."
- Data Cataloging: Creating and maintaining a central repository of metadata using Glue Catalog and Crawlers.
- Data Orchestration: Building and managing automated workflows to run the data pipelines, likely using AWS Step Functions.
- Infrastructure as Code (IaC): Provisioning and managing the AWS infrastructure (S3, Glue, Lake Formation, etc.) as code. The listed frameworks—CloudFormation, Stacker, and Terraform—are all IaC tools.
- CI/CD: Building and maintaining automated GitLab pipelines to deploy the data pipelines and infrastructure. This is critical for the "quicker feedback loop" mentioned.
- Serverless Architecture: Developing and managing serverless components, specifically AWS Lambda functions, which are often used for event-driven data processing or to trigger other services.
- Scripting and Automation: Proficiency in Python, both for AWS SDK (boto3) interactions and for general automation tasks.
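To make the serverless piece concrete, here is a minimal, hedged sketch of the kind of Lambda handler used for event-driven processing, one that parses an S3 event notification and collects the objects that landed. The bucket and key names in the sample event are hypothetical:

```python
# Minimal sketch of an event-driven AWS Lambda handler: when a file lands
# in S3, extract the bucket/key for each object so downstream processing
# (e.g. starting a Glue job or Step Functions execution) can be triggered.
import json
import urllib.parse

def lambda_handler(event, context):
    """Return the bucket/key pairs contained in an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        objects.append({"bucket": bucket, "key": key})
    # In a real pipeline this is where a Glue job or state machine would start
    return {"statusCode": 200, "body": json.dumps(objects)}
```

Invoked with a standard S3 `Records` event, the handler returns the decoded object keys; wiring it to an actual Glue or Step Functions call would add a boto3 client, which is omitted here to keep the sketch self-contained.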