What are the responsibilities and job description for the ETL Developer position at Metalight Solutions Inc?
Position Description:
- Responsible for designing, building, and maintaining data pipelines and infrastructure to support data-driven decisions and analytics.
- Design, develop and maintain data pipelines and extract, transform, load (ETL) processes to collect, process and store structured and unstructured data
- Build data architecture and storage solutions, including data lakehouses, data lakes, data warehouses, and data marts to support analytics and reporting
- Develop data reliability, efficiency, and quality checks and processes
- Prepare data for data modeling
- Monitor and optimize data architecture and data processing systems
- Collaborate with multiple teams to understand requirements and objectives
- Administer testing and troubleshooting related to performance, reliability, and scalability
- Create and update documentation
Role and Responsibilities:
- Design and implement robust, scalable data models to support the application, analytics and business intelligence initiatives.
- Optimize data warehousing solutions and manage data migrations in the AWS ecosystem, utilizing Amazon Redshift, RDS, and DocumentDB services.
- Develop and maintain scalable ETL pipelines using AWS Glue and other AWS services to enhance data collection, integration, and aggregation.
- Ensure data integrity and timeliness in the data pipeline, troubleshooting any issues that arise during data processing.
- Integrate data from various sources using AWS technologies, ensuring seamless data flow across systems.
- Collaborate with stakeholders to define data ingestion requirements and implement solutions to meet business needs.
- Monitor, tune, and manage database performance to ensure efficient data loads and queries.
- Implement best practices for data management within AWS to optimize storage and computing costs.
- Ensure all data practices comply with regulatory requirements and department policies.
- Implement and maintain security measures to protect data within AWS services.
- Lead and mentor junior data engineers and team members on AWS best practices and technical challenges.
- Collaborate with UI/API team, business analysts, and other stakeholders to support data-driven decision-making.
- Explore and adopt new technologies within the AWS cloud to enhance the capabilities of the data platform.
- Continuously improve existing systems by analyzing business needs and technology trends.
General Experience: The proposed candidate must have a minimum of three (3) years of experience as a data engineer.
Specialized Experience:
- The candidate should have experience as a data engineer or in a similar role, with a strong understanding of data architecture and ETL processes. The candidate should be proficient in programming languages for data processing and knowledgeable about distributed computing and parallel processing.
- Minimum 5 years ETL coding experience
- Proficiency in programming languages such as Python and SQL for data processing and automation
- Experience with distributed computing frameworks like Apache Spark or similar technologies
- Experience with AWS data environment, primarily Glue, S3, DocumentDB, Redshift, RDS, Athena, etc.
- Experience with data warehouses/RDBMS like Redshift and NoSQL data stores such as DocumentDB, DynamoDB, OpenSearch, etc.
- Experience in building data lakes using AWS Lake Formation
- Experience with workflow orchestration and scheduling tools like AWS Step Functions, AWS MWAA, etc.
- Strong understanding of relational databases (including tables, views, indexes, table spaces)
- Experience with source control tools such as GitHub and related CI/CD processes
- Ability to analyze a company's data needs
- Strong problem-solving skills
- Experience with the SDLC and Agile methodologies