What are the responsibilities and job description for the Data Architect position at Incedo Inc.?
Role Overview
We are seeking a Databricks Data Architect to support the design, implementation, and optimization of cloud-native data platforms built on the Databricks Lakehouse architecture. This is a hands-on, engineering-driven role requiring deep experience with Apache Spark, Delta Lake, and scalable data pipeline development, combined with early-stage architectural responsibilities.
The role involves close onsite collaboration with client stakeholders, translating analytical and operational requirements into robust, high-performance data architectures, while adhering to best practices for data modeling, governance, reliability, and cost efficiency.
Key Responsibilities
- Design, develop, and maintain batch and near-real-time data pipelines using Databricks, PySpark, and Spark SQL (a minimal illustrative sketch follows this list)
- Implement Medallion (Bronze/Silver/Gold) Lakehouse architectures, ensuring proper data quality, lineage, and transformation logic across layers
- Build and manage Delta Lake tables, including schema evolution, ACID transactions, time travel, and optimized data layouts
- Apply performance optimization techniques such as partitioning strategies, Z-Ordering, caching, broadcast joins, and Spark execution tuning
- Support dimensional and analytical data modeling for downstream consumption by BI tools and analytics applications
- Assist in defining data ingestion patterns (batch, incremental loads, CDC, and streaming where applicable)
- Troubleshoot and resolve pipeline failures, data quality issues, and Spark job performance bottlenecks
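For illustration only (not part of the formal requirements), a minimal PySpark sketch of the Bronze/Silver/Gold flow described above might look like the following; the table names, paths, and columns are hypothetical placeholders rather than client specifics.

```python
# Minimal, illustrative Medallion (Bronze/Silver/Gold) pipeline sketch.
# Table names, paths, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # supplied by the Databricks runtime

# Bronze: land raw source files as-is, tagged with ingestion metadata
raw = (spark.read.format("json").load("/mnt/raw/orders/")
       .withColumn("_ingested_at", F.current_timestamp()))
raw.write.format("delta").mode("append").saveAsTable("bronze.orders")

# Silver: apply basic data-quality rules and deduplicate on the business key
silver = (spark.table("bronze.orders")
          .dropDuplicates(["order_id"])
          .filter(F.col("order_amount") > 0))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")

# Gold: join to a small dimension (broadcast to avoid a shuffle) and aggregate for BI
dim_customer = spark.table("silver.customers")
gold = (silver.join(F.broadcast(dim_customer), "customer_id")
        .groupBy("customer_region")
        .agg(F.sum("order_amount").alias("total_revenue")))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.revenue_by_region")

# Co-locate data on the most common filter/join column (Databricks Delta OPTIMIZE)
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")
```

In practice, the layer boundaries, quality rules, and table layouts would follow the client's standards and the governance practices noted above.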
Nice-to-Have Skills
- Exposure to Databricks Unity Catalog, data governance, and access control models
- Experience with Databricks Workflows, Apache Airflow, or Azure Data Factory for orchestration
- Familiarity with streaming frameworks (Spark Structured Streaming, Kafka) and/or CDC patterns (an incremental-ingestion sketch appears at the end of this section)
- Understanding of data quality frameworks, validation checks, and observability concepts
- Experience integrating Databricks with BI tools such as Power BI, Tableau, or Looker
- Awareness of cost optimization strategies in cloud-based data platforms
- Prior experience in the Life Sciences domain
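As a further illustration of the streaming and incremental-load patterns listed above, the sketch below uses Databricks Auto Loader with Structured Streaming; the paths, table names, and checkpoint locations are hypothetical placeholders.

```python
# Illustrative incremental-ingestion sketch using Databricks Auto Loader
# and Structured Streaming. Paths and table names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # supplied by the Databricks runtime

# Incrementally discover and ingest new files from cloud storage into Bronze
events = (spark.readStream.format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", "/mnt/checkpoints/events_schema/")
          .load("/mnt/raw/events/")
          .withColumn("_ingested_at", F.current_timestamp()))

# Write to a Delta table; availableNow processes the current backlog and stops,
# which suits scheduled, batch-style incremental loads
(events.writeStream
       .format("delta")
       .option("checkpointLocation", "/mnt/checkpoints/events_bronze/")
       .trigger(availableNow=True)
       .toTable("bronze.events"))
```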