What are the responsibilities and job description for the Databricks Architect position at Nous Infosystems?
Job Details
Role: Databricks Lead/Architect
Location: Denver, CO - Hybrid
Local candidates only.
Job Overview
We are seeking a highly skilled and experienced hands-on Databricks Architect to design,
build, and optimize our modern data streaming platform. This platform, built on Azure and
leveraging the Databricks Lakehouse, is central to our Data Availability Initiative. It provides
comprehensive data ingestion, real-time streaming, batch processing, analytics, and AI/ML
capabilities. The ideal candidate will be a technical lead with deep expertise in Databricks,
Azure cloud services, and a strong understanding of data architecture principles like the
Medallion Architecture and Data Mesh. This role requires a hands-on approach to
implementing solutions, driving robust data governance through Unity Catalog, and
integrating advanced AI/ML models.
Key Responsibilities
- Lead the architectural design, development, and implementation of scalable and robust data solutions on the Databricks Lakehouse Platform within an Azure environment.
- Design and implement a robust multi-workspace strategy by configuring and automating the deployment of Development, QA, and Production Databricks environments.
- Standardize environment setup using Terraform, ensuring consistent, repeatable, and version-controlled infrastructure across the entire SDLC.
- Architect and implement data ingestion strategies for real-time streaming (using Debezium, Apache Kafka, and the Confluent Platform) and batch processing, leveraging Databricks LakeFlow and Auto Loader.
- Design and develop data pipelines following the Medallion Architecture (Landing, Bronze, Silver, and Gold tiers) using Databricks Delta Live Tables (DLT) and Structured Streaming, ensuring data quality, lineage, and governance.
- Optimize Databricks clusters, Delta Lake tables, and Spark jobs for performance, scalability, and cost efficiency, including leveraging auto-scaling, spot instances, compression, and partitioning techniques.
- Establish and enforce comprehensive data governance, security, and access control policies utilizing Databricks Unity Catalog and Azure Purview across multi-tenant workspaces and business units.
- Develop and integrate MLOps pipelines using Databricks MLflow and Azure Machine Learning for model training, serving, and monitoring, including custom Large Language Model (LLM) integrations.
- Collaborate with business stakeholders to translate complex data requirements into technical designs and scalable solutions.
- Implement CI/CD practices for Databricks solutions using Databricks Asset Bundles, Terraform, GitHub, and GitHub Actions, ensuring automated deployment and version control.
- Integrate data from diverse sources, including SQL databases (PostgreSQL, SQL Server), MongoDB, Snowflake, and various flat/multimedia files, into the data platform.
- Design and implement API-first data consumption patterns, serverless functions (Azure Functions), and application integrations for various data products and real-time services.
- Provide technical leadership, mentorship, and best practices to data engineering and development teams.
- Contribute to the continuous evolution of the data platform, incorporating future considerations such as the Databricks Ingestion Gateway, advanced AI integration, and Data Mesh evolution.
Required Skills
Databricks Lakehouse, Apache Spark, Delta Live Tables (DLT), Auto Loader, Microsoft
Azure, ADLS Gen2, Azure Functions, Azure Machine Learning, Apache Kafka, Confluent
Platform, Debezium (CDC), Databricks Unity Catalog, Azure Purview, Python, SQL, Scala
(Advantage), Databricks Asset Bundles, Terraform, GitHub Actions, Medallion Architecture,
Data Mesh, PostgreSQL, MongoDB, Snowflake, MLOps, LLM Integration, Azure Networking
Qualifications
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related quantitative field.
- 8 years of progressive experience in data architecture, data engineering, or a related role within an enterprise environment.
- 5 years of hands-on experience specifically with the Databricks Lakehouse Platform, including designing and implementing complex data solutions.
- Proven track record of successfully designing and implementing large-scale, high-performance, and cost-optimized data platforms on Microsoft Azure.
- Databricks Certified Data Engineer (Associate or Professional) and/or Azure Data Engineer Associate (DP-203) certifications are highly desirable.
- Excellent communication, presentation, and interpersonal skills with the ability to influence technical and non-technical stakeholders effectively.
Salary: $70 - $85