What are the responsibilities and job description for the Senior Data Architect position at Net2Source (N2S)?
Title: Senior Data Architect
Location: Melville, New York, United States
Mandatory Skills:
- Experience architecting solutions on cloud data engineering platforms (GCP, AWS, Azure)
- Ability to design end-to-end data architectures on GCP leveraging BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Composer, Cloud Storage, and Looker
- Expertise in modern data lake, data warehouse, and lakehouse architectures
- Strong experience with ETL/ELT pipelines using Apache Beam, Spark, and orchestration frameworks
- BigQuery optimization including partitioning, clustering, materialized views, BI Engine, and storage optimization
- CI/CD implementation using Cloud Build, GitHub/GitLab pipelines, and Terraform
- Data governance, security, IAM, compliance, and cloud modernization expertise
Role Description / Responsibilities:
Client Management Experience
- Work closely with client enterprise architects and stakeholders
- Act as a mentor to offshore technical data analytics teams
- Provide architectural leadership and strategic direction for enterprise data initiatives
Data Architecture Experience
- Architect solutions on Cloud Data Engineering platforms such as GCP, AWS, and Azure
- Design end-to-end data architectures on GCP leveraging the services below (a minimal ingestion sketch follows the list):
  - BigQuery
  - Dataflow
  - Dataproc
  - Pub/Sub
  - Cloud Composer
  - Cloud Storage
  - Looker
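A minimal sketch of the streaming entry point such an architecture might use, in Python. The project name ("my-project"), topic name ("orders-events"), and event payload are placeholders for illustration, not details from the posting; events published this way would typically be consumed by a Dataflow pipeline and landed in BigQuery.

```python
# Minimal Pub/Sub ingestion sketch (project/topic names are placeholders).
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "orders-events")

# A hypothetical order event; downstream consumers would parse this JSON.
event = {"order_id": "1234", "amount": 99.50, "currency": "USD"}
future = publisher.publish(topic_path, json.dumps(event).encode("utf-8"))
print(f"Published message ID: {future.result()}")
```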
- Develop modern data lake, data warehouse, and lakehouse architectures using industry best practices and the Google Cloud Architecture Framework
- Preferred experience with AWS, Azure, and Databricks ecosystems
- Create logical and physical data models, integration patterns, reference architectures, and data flow diagrams
- Architect scalable ETL/ELT pipelines (see the Beam sketch after this list) using:
  - Dataflow (Apache Beam)
  - Dataproc (Spark)
  - Cloud Composer orchestration
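A minimal Apache Beam sketch of the Pub/Sub-to-BigQuery pattern named above, in Python. The subscription, table, and schema are hypothetical; a real Dataflow job would add windowing, error handling, and runner-specific options.

```python
# Streaming ETL sketch: Pub/Sub -> parse -> BigQuery. Runs until cancelled.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

options = PipelineOptions()
options.view_as(StandardOptions).streaming = True

with beam.Pipeline(options=options) as p:
    (
        p
        # Subscription path is a placeholder.
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/orders-sub")
        # Decode each message's JSON payload into a dict.
        | "ParseJson" >> beam.Map(json.loads)
        # Table spec and schema are placeholders matching the event above.
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            "my-project:analytics.orders",
            schema="order_id:STRING,amount:FLOAT,currency:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```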
- Lead cloud-native modernization initiatives and migration from legacy platforms to GCP
Data Engineering & Development Oversight
- Guide data engineering teams on standards, reusable frameworks, and best practices
- Optimize BigQuery performance (illustrated by the sketch after this list) using:
  - Partitioning
  - Clustering
  - Materialized views
  - BI Engine
  - Storage optimization techniques
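To make the optimization levers concrete, a hedged sketch using the google-cloud-bigquery client to apply partitioning, clustering, and a materialized view. The project, dataset, and column names are placeholders.

```python
# Sketch: partitioned + clustered table, plus a materialized view.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

client.query("""
    CREATE TABLE IF NOT EXISTS analytics.orders (
      order_id STRING,
      customer_id STRING,
      amount FLOAT64,
      order_ts TIMESTAMP
    )
    PARTITION BY DATE(order_ts)   -- prune scans to the dates queried
    CLUSTER BY customer_id        -- co-locate rows for selective filters
""").result()

client.query("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.daily_revenue AS
    SELECT DATE(order_ts) AS order_date, SUM(amount) AS revenue
    FROM analytics.orders
    GROUP BY order_date
""").result()
```

Partitioning on DATE(order_ts) lets BigQuery skip whole partitions at scan time, while clustering on customer_id co-locates related rows so selective filters read less data; the materialized view precomputes the aggregate for cheap, fresh reads.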
- Implement data quality, observability, metadata management, and lineage frameworks using Dataplex and Data Catalog (a minimal lookup sketch follows)
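As one concrete piece of the metadata story, a sketch that looks up a BigQuery table's catalog entry through the Data Catalog client. The project, dataset, and table names in the resource path are placeholders.

```python
# Sketch: fetch the catalog entry for a BigQuery table by linked resource.
from google.cloud import datacatalog_v1

client = datacatalog_v1.DataCatalogClient()
entry = client.lookup_entry(
    request={
        "linked_resource": (
            "//bigquery.googleapis.com/projects/my-project"
            "/datasets/analytics/tables/orders"
        )
    }
)
print(entry.name, entry.type_)
```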
- Ensure adoption of CI/CD pipelines and Infrastructure-as-Code using:
  - Cloud Build
  - GitHub/GitLab
  - Terraform
Data Governance & Security
- Define and enforce data governance standards (see the retention sketch after this list), including:
  - Access models
  - Encryption
  - Retention policies
  - Data lifecycle management
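One way retention and lifecycle rules surface in practice: a sketch that sets a default table expiration on a BigQuery dataset. The project and dataset names are placeholders, and the 400-day window is chosen purely for illustration.

```python
# Sketch: enforce a retention policy via a default table expiration.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
dataset = client.get_dataset("my-project.analytics")

# ~400 days, in milliseconds; new tables in the dataset inherit this.
dataset.default_table_expiration_ms = 400 * 24 * 60 * 60 * 1000
client.update_dataset(dataset, ["default_table_expiration_ms"])
```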
- Implement secure cloud practices (see the IAM sketch after this list) using:
  - IAM policies
  - VPC Service Controls
  - Organizational policy constraints
  - Secure data sharing patterns
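A minimal IAM sketch in the same spirit: granting a hypothetical ETL service account read access to a Cloud Storage bucket via an IAM binding. The bucket and service-account names are placeholders.

```python
# Sketch: add an IAM binding on a Cloud Storage bucket.
from google.cloud import storage

client = storage.Client(project="my-project")
bucket = client.bucket("analytics-landing-zone")

policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.objectViewer",
    "members": {
        "serviceAccount:etl-runner@my-project.iam.gserviceaccount.com"
    },
})
bucket.set_iam_policy(policy)
```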
- Ensure compliance with GDPR, HIPAA, PCI, and internal corporate policies