What are the responsibilities and job description for the Databricks and AWS Focused Data Engineer position at Largeton Group?
Job Summary: Databricks and AWS Data Engineer (Contract)
- Position: Data Engineer (Databricks and AWS Focus)
- Location: Onsite, Columbus, OH
- Duration: 3 Months (Contract)

Responsibilities:
- Develop and maintain scalable data pipelines using PySpark/Spark on Databricks.
- Implement medallion architecture (raw, trusted, refined layers) for data processing (a minimal PySpark sketch follows this list).
- Integrate streaming (Kafka) and batch data sources, including APIs.
- Model/register datasets in enterprise data catalogs to ensure governance and accessibility.
- Manage secure, role-based access controls for analytics, AI, and ML use cases.
- Collaborate with team members to deliver high-quality, well-tested code.
- Optimize and operationalize Spark jobs and Delta Lake performance on AWS.
- Implement data quality checks, validations, and CI/CD for Databricks workflows.
- Provision/manage Databricks and AWS resources using Terraform (IaC).
- Set up monitoring/logging/alerts (CloudWatch, Datadog, Databricks audit logs).
- Produce technical documentation, runbooks, and data lineage.
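
To make the medallion layering above concrete, here is a minimal PySpark/Delta Lake sketch of the raw → trusted → refined flow. The bucket path, database/table names, and columns (order_id, order_ts, amount) are hypothetical placeholders for illustration, not details from the posting:

```python
# Minimal medallion (raw -> trusted -> refined) sketch on Databricks with
# Delta Lake. Paths, databases, and schema are hypothetical and assumed to exist.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # provided by the Databricks runtime

# Raw layer: land source data as-is, tagged with ingestion metadata.
raw = (
    spark.read.json("s3://example-bucket/landing/orders/")  # hypothetical path
    .withColumn("_ingested_at", F.current_timestamp())
)
raw.write.format("delta").mode("append").saveAsTable("raw.orders")

# Trusted layer: validated, deduplicated, correctly typed records.
trusted = (
    spark.read.table("raw.orders")
    .dropDuplicates(["order_id"])
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_ts", F.to_timestamp("order_ts"))
)
trusted.write.format("delta").mode("overwrite").saveAsTable("trusted.orders")

# Refined layer: business-level aggregates for analytics/ML consumers.
refined = (
    trusted.groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(F.count("*").alias("order_count"), F.sum("amount").alias("revenue"))
)
refined.write.format("delta").mode("overwrite").saveAsTable("refined.daily_orders")
```

In practice each layer would typically run as a separate task in a Databricks Workflow rather than as one linear script, so failures can be retried per layer.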

Requirements:
- 6-9 years of expert-level Databricks experience.
- 6-9 years of advanced hands-on PySpark/Spark experience.
- 6-9 years with AWS (including S3) and Terraform (IaC).
- Strong knowledge of medallion architecture and data warehousing best practices.
- Experience building, optimizing, and governing enterprise data pipelines.
- Expertise in Delta Lake internals, time travel, schema enforcement, and Unity Catalog RBAC/ABAC.
- Hands-on experience with Spark Structured Streaming, Kafka, and late-arriving data handling (see the streaming sketch after this list).
- Familiarity with Git-based workflows and CI/CD (Databricks Repos, dbx, GitHub Actions, etc.).
- Experience with security/compliance: IAM, encryption, secrets management, PII governance.
- Proven ability to tune Spark jobs and optimize Databricks/AWS usage for performance and cost.
- Experience working in Agile/Scrum teams and code review processes.
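
As a rough illustration of the streaming requirement above, the following sketch reads a Kafka topic with Spark Structured Streaming and uses a watermark to bound late-arriving data. The broker address, topic name, event schema, and checkpoint path are all assumptions made for the example:

```python
# Minimal Structured Streaming sketch: Kafka source -> watermarked hourly
# aggregate -> Delta table. All names and addresses are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType, DoubleType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("amount", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "orders")                     # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Tolerate records arriving up to 15 minutes late; older ones are dropped.
hourly = (
    events.withWatermark("event_ts", "15 minutes")
    .groupBy(F.window("event_ts", "1 hour"))
    .agg(F.sum("amount").alias("revenue"))
)

query = (
    hourly.writeStream.format("delta")
    .outputMode("append")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/orders_hourly")
    .toTable("trusted.orders_hourly")
)
```

With append output mode, a window's aggregate is emitted only once the watermark passes the window's end, which is how records later than the 15-minute bound are dropped rather than double-counted.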

Preferred Qualifications:
- Certifications: Databricks Data Engineer Professional, AWS Solutions Architect/Developer, Terraform Associate.
- Experience with enterprise data catalogs (Collibra, Alation) and data lineage tools (OpenLineage).
- Experience with orchestration tools: Databricks Workflows, Airflow.
- Additional AWS services: Glue, Lambda, Step Functions, CloudWatch, Secrets Manager.
- Experience with testing frameworks: pytest, chispa, Great Expectations, dbx test (a minimal test sketch follows this list).
- Background in analytics/ML pipelines and MLOps integrations.
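
For the testing bullet above, here is a minimal pytest + chispa sketch. The `standardize_orders` transform is a hypothetical function invented for the example, not something from the posting:

```python
# Minimal PySpark unit test with pytest and chispa. The transform under test
# (standardize_orders) is hypothetical.
import pytest
from pyspark.sql import SparkSession, functions as F
from chispa import assert_df_equality


def standardize_orders(df):
    """Hypothetical transform: trim order ids and drop null orders."""
    return (
        df.withColumn("order_id", F.trim("order_id"))
        .filter(F.col("order_id").isNotNull())
    )


@pytest.fixture(scope="session")
def spark():
    # Local session so the test runs outside Databricks.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_standardize_orders(spark):
    source = spark.createDataFrame([(" a1 ",), (None,)], ["order_id"])
    expected = spark.createDataFrame([("a1",)], ["order_id"])
    assert_df_equality(standardize_orders(source), expected)
```

Run locally with `pytest` (only pyspark and chispa need to be installed); the same test can gate a CI/CD deploy of Databricks workflows.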