What are the responsibilities and job description for the Data Lead Engineer Only W2 Candidates position at Mega Cloud Lab?
Location: Fremont, CA (Onsite)
Required
- Azure Data Factory (ADF), Azure Databricks & PySpark, Azure Synapse, Azure SQL, Python, and Spark SQL
- Accomplished Data Architect / Senior Data Engineer with 12 years of experience designing and modernizing enterprise data platforms across banking, healthcare, and retail domains, supporting high-volume transactional and regulatory data processing.
- Led enterprise-scale Data Architecture initiatives, defining logical and physical data models, governance standards, and cloud-native platform blueprints across AWS and Azure environments.
- Designed and implemented Medallion (Bronze/Silver/Gold) Lakehouse architectures using Delta Lake, S3, ADLS Gen2, Snowflake, Redshift, and Synapse Analytics.
- Engineered large-scale distributed processing workloads using Apache Spark, PySpark, Databricks, EMR, Hive, and HDFS, processing billions of records for enterprise analytics.
- Orchestrated complex data workflows using Apache Airflow, Databricks Workflows, AWS Step Functions, and Azure Data Factory triggers, ensuring SLA-driven pipeline execution.
- Strong hands-on experience in Advanced SQL, including complex joins, CTEs, window functions, stored procedures, indexing strategies, partitioning, and execution-plan optimization across Snowflake, PostgreSQL, and Oracle.
- Built real-time streaming architectures using Apache Kafka, AWS Kinesis, Azure Event Hubs, and Service Bus, supporting fraud detection, claims monitoring, and operational telemetry.

Skills: SQL, ADF, PySpark, Azure
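The Medallion (Bronze/Silver/Gold) pattern named in the responsibilities is, in production, typically built on Spark DataFrames and Delta Lake tables; the following is only a minimal, dependency-free sketch of the layering idea in plain Python, and every record field here is hypothetical, not taken from the job description.

```python
# Conceptual sketch of Medallion (Bronze -> Silver -> Gold) layering using
# plain Python structures. Real pipelines use PySpark + Delta Lake; the
# field names below are made up for illustration.

def to_silver(bronze):
    """Silver layer: cleanse raw records - drop rows missing keys, cast types."""
    silver = []
    for rec in bronze:
        if rec.get("txn_id") is None or rec.get("amount") is None:
            continue  # drop/quarantine malformed raw records
        silver.append({
            "txn_id": int(rec["txn_id"]),
            "account": str(rec.get("account", "UNKNOWN")).upper(),
            "amount": float(rec["amount"]),
        })
    return silver

def to_gold(silver):
    """Gold layer: business-level aggregate - total amount per account."""
    totals = {}
    for rec in silver:
        totals[rec["account"]] = totals.get(rec["account"], 0.0) + rec["amount"]
    return totals

bronze = [
    {"txn_id": "1", "account": "acct-a", "amount": "10.5"},
    {"txn_id": None, "account": "acct-a", "amount": "99"},  # malformed row
    {"txn_id": "2", "account": "acct-b", "amount": "4.5"},
    {"txn_id": "3", "account": "acct-a", "amount": "2.0"},
]

gold = to_gold(to_silver(bronze))
print(gold)  # {'ACCT-A': 12.5, 'ACCT-B': 4.5}
```

The point of the layering is that each stage has a single concern: Bronze keeps raw ingested data as-is, Silver enforces schema and quality, and Gold serves aggregated, analytics-ready views.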
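The Advanced SQL requirement (CTEs, window functions) can be illustrated with a small self-contained example using Python's stdlib `sqlite3`; the `claims` table and its columns are hypothetical, chosen only to echo the claims-monitoring context above.

```python
import sqlite3

# In-memory database with a hypothetical claims table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (claim_id INTEGER, member_id TEXT, amount REAL, filed_on TEXT);
INSERT INTO claims VALUES
  (1, 'M1', 100.0, '2024-01-05'),
  (2, 'M1', 250.0, '2024-02-10'),
  (3, 'M2',  80.0, '2024-01-20'),
  (4, 'M2', 300.0, '2024-03-01'),
  (5, 'M2',  50.0, '2024-03-15');
""")

# A CTE filters to recent claims; ROW_NUMBER() partitioned by member ranks
# each member's claims by amount, so the outer query keeps the largest one.
rows = conn.execute("""
WITH recent AS (
  SELECT * FROM claims WHERE filed_on >= '2024-01-01'
)
SELECT claim_id, member_id, amount
FROM (
  SELECT claim_id, member_id, amount,
         ROW_NUMBER() OVER (PARTITION BY member_id ORDER BY amount DESC) AS rn
  FROM recent
)
WHERE rn = 1
ORDER BY member_id;
""").fetchall()

print(rows)  # [(2, 'M1', 250.0), (4, 'M2', 300.0)]
```

The same CTE/window-function pattern carries over directly to Snowflake, PostgreSQL, and Oracle, which is why postings like this one call it out explicitly.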