What are the responsibilities and job description for the Data Architect (Google Cloud Platform) position at TechSpace Solutions Inc.?

Job Title: Data Architect (Google Cloud Platform)

Location: Dallas, TX / Charlotte, NC (Onsite)

Duration: 12 Months

Project/Program:

Identity & Access Management (IAM) Data Modernization

Migration of an on-premises SQL data warehouse to a modern enterprise Data Lake platform, enabling analytics and GenAI use cases.

The platform leverages PySpark-based processing, CI/CD pipelines, and containerized deployments on OpenShift (OCP), with Google Cloud Platform as the preferred cloud platform, to deliver scalable, secure, and high-performance data solutions.

About the Program/Project:

The IAM Data Modernization program focuses on transforming legacy data platforms into a scalable and cloud-compatible architecture.

Key Highlights:

Integration Scope: 30 source systems with multiple downstream integrations

Capabilities: Metrics, reporting, advanced analytics, and GenAI use cases (NL querying, summarization, cross-domain insights)

Benefits:

Scalable and resilient data platform
High-performance semantic and analytics layer
Single source of truth for enterprise-wide reporting and analytics

Role Summary:

We are looking for a Data Architect with strong expertise in OpenShift (OCP), PySpark, and CI/CD pipelines to design and govern scalable data platforms.

The role requires defining end-to-end data architecture, containerized deployment patterns, orchestration strategies (Airflow/Autosys), and platform standards, along with hands-on involvement in implementation.

Key Responsibilities:

Data Architecture & Platform Design:

Define enterprise data architecture for IAM data lake and analytics platform
Design scalable, modular, and containerized data pipeline architectures on OCP
Establish data models, schema governance, and data lifecycle strategies
Define best practices for data partitioning, performance optimization, and cost efficiency

OpenShift (OCP) & Platform Engineering:

Architect and govern containerized data workloads on OpenShift (OCP)
Define standards for deployment, scaling, and workload isolation
Collaborate with DevOps teams for platform engineering and infrastructure alignment

Big Data & Processing (PySpark Focus):

Define architecture for PySpark-based batch and near real-time processing pipelines
Provide guidance on distributed processing design, optimization, and performance tuning
Establish reusable frameworks for ETL/ELT processing

Data Ingestion & Orchestration:

Architect data ingestion frameworks (batch, streaming, CDC)
Define orchestration strategies using Airflow / Autosys
Implement standards for retries, backfills, dependency management, and error handling

DevOps / CI-CD:

Define and oversee CI/CD strategy for data and platform deployments
Enable automation of build, test, and deployment processes
Ensure integration of CI/CD pipelines with OCP-based environments

Cloud & Data Platforms:

Provide architecture guidance for Google Cloud Platform-based data platforms (preferred, not mandatory)
Define integration patterns for cloud-native and on-premises hybrid environments
Guide teams on cloud migration strategies and modern data platform adoption

Data Governance, Quality & Observability:

Define frameworks for data quality, validation, and lineage
Define frameworks for metadata management and cataloging
Establish monitoring, logging, alerting, and SLOs for platform reliability
Ensure compliance with data security and audit requirements

Stakeholder Collaboration:

Work closely with client architects, IAM teams, and business stakeholders
Translate business requirements into scalable technical architecture
Provide architectural guidance and mentorship to engineering teams

Core Skills (Must Have):

OpenShift (OCP) / Kubernetes-based platforms
PySpark / Spark ecosystem
CI/CD implementation for data platforms
Airflow / Autosys orchestration tools

Solid understanding of:

Data lake architectures (layered models)
ETL/ELT design patterns
Distributed data processing concepts

Data Engineering & Storage:

Data formats: Parquet, ORC, Avro
Partitioning and performance tuning
Large-scale data modeling for analytics

Cloud (Preferred, Not Mandatory):

Experience with Google Cloud Platform (GCP) is preferred
Exposure to services such as BigQuery, Dataproc, Dataflow, and GCS is a plus

Observability & Reliability:

Monitoring, logging, alerting frameworks
Dashboards, SLOs, and operational runbooks

Good to Have:

Experience with IAM domain / cybersecurity data
Understanding of data security and access control frameworks
Exposure to GenAI-enabled data platforms
Experience in Agile delivery and team leadership

Experience:

10-14 years in Data Architecture / Data Engineering
Strong experience in OCP, PySpark, CI/CD, and orchestration frameworks
Prior experience in data modernization / migration programs

Education:

Bachelor's/Master's in Computer Science, Information Systems, or equivalent

Certifications:

OpenShift / Kubernetes certifications
Google Cloud Platform certifications
