What are the responsibilities and job description for the Principal Data Architect position at Apex IT Services?
Principal Data Architect
Bethesda, MD (3 days in office)
Long term
Qualifications
- 20 years in data architecture/engineering, leading large-scale migrations (hundreds of millions to billions of records) from legacy/mainframe sources (e.g., DB2) to cloud/SaaS targets.
- Strong SQL and PostgreSQL expertise (staging/ETL persistence): schema design, partitioning, bulk loading, tuning, and operational rigor (HA/DR/backup, auditability, reconciliation).
- Delta/incremental load design (CDC/watermarking, micro-batching), with idempotent processing and replay/backfill strategies.
- Salesforce data loading at scale: object modeling, external IDs, load sequencing, Bulk API patterns, and governor/locking constraints.
- Data quality, de-duplication, and survivorship frameworks for customer/transaction domains; strong validation and reconciliation patterns.
- Cloud experience on AWS supporting large-scale data platforms/migration factories (e.g., S3, RDS/Aurora PostgreSQL, Glue, EMR/Spark, Lambda/Step Functions, IAM/KMS, CloudWatch).
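The delta/incremental load pattern named above (watermarking with idempotent processing) can be sketched as follows. This is a minimal illustration only, using SQLite in place of PostgreSQL; the table names (`source_accounts`, `staging_accounts`), columns, and sample dates are hypothetical, not part of the role description:

```python
import sqlite3

def extract_delta(src, watermark):
    """High-water-mark CDC: pull only rows changed since the last watermark."""
    return src.execute(
        "SELECT id, name, updated_at FROM source_accounts "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()

def upsert_batch(tgt, rows):
    """Idempotent load: replaying the same batch leaves the target unchanged."""
    tgt.executemany(
        "INSERT INTO staging_accounts (id, name, updated_at) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET name = excluded.name, "
        "updated_at = excluded.updated_at",
        rows,
    )
    # Advance the watermark to the newest change we have persisted.
    return max((r[2] for r in rows), default=None)

# Demo: two incremental runs over a toy source table.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE source_accounts (id INTEGER, name TEXT, updated_at TEXT)")
src.executemany("INSERT INTO source_accounts VALUES (?, ?, ?)",
                [(1, "Acme", "2024-01-01"), (2, "Globex", "2024-01-02")])
tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE staging_accounts "
            "(id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)")

watermark = upsert_batch(tgt, extract_delta(src, ""))  # first run loads both rows
# Replay/backfill from the same watermark is a no-op because the upsert is idempotent.
watermark = upsert_batch(tgt, extract_delta(src, watermark)) or watermark
```

The same shape carries over to PostgreSQL's `INSERT ... ON CONFLICT DO UPDATE`, with the watermark persisted per run so failed batches can be replayed safely.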
Large-Volume Source-to-Target Data Validation
- Define end-to-end reconciliation (DB2 -> PostgreSQL -> Salesforce): record counts and control totals (e.g., financial amounts, points balances), tolerances, and sign-off thresholds.
- Implement scalable integrity checks: partition-level hashing/checksums, stratified sampling, and distribution/edge-case validation.
- Operationalize quality and auditability: rule-based validations, automated quarantine/exception workflows, lineage (batch/run IDs, manifests, watermarks), and dashboarded reporting.
- Account for Salesforce constraints: external ID uniqueness/idempotency, lock/contention monitoring, and post-load relationship verification.
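The partition-level hashing/checksum check listed above can be sketched like this. The row layout (`id`, month, points), partitioning by month, and the canonical-string helper are illustrative assumptions; an order-independent XOR of per-row digests lets source and target be compared without sorting either side:

```python
import hashlib
from collections import defaultdict

def partition_checksums(rows, partition_of, canonical):
    """Order-independent checksum per partition: XOR of per-row SHA-256 prefixes."""
    sums = defaultdict(int)
    for row in rows:
        digest = hashlib.sha256(canonical(row).encode()).digest()
        sums[partition_of(row)] ^= int.from_bytes(digest[:8], "big")
    return dict(sums)

def mismatched_partitions(source_rows, target_rows, partition_of, canonical):
    """Compare source vs. target; return only partitions needing row-level drill-down."""
    src = partition_checksums(source_rows, partition_of, canonical)
    tgt = partition_checksums(target_rows, partition_of, canonical)
    return sorted(p for p in src.keys() | tgt.keys() if src.get(p) != tgt.get(p))

# Demo: (id, month, points) rows partitioned by month; one target row has drifted.
source = [(1, "2024-01", 100), (2, "2024-01", 250), (3, "2024-02", 75)]
target = [(1, "2024-01", 100), (2, "2024-01", 999), (3, "2024-02", 75)]
bad = mismatched_partitions(source, target,
                            partition_of=lambda r: r[1],
                            canonical=lambda r: f"{r[0]}|{r[2]}")
```

At migration scale the checksums would be computed in-database (or in Spark) per partition, so only the small set of mismatched partitions is pulled for row-level comparison.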