What are the responsibilities and job description for the Data Engineer position at Prometheus Federal Services (PFS)?
Position Summary
Prometheus Federal Services (PFS) is a trusted partner of federal health agencies. We are seeking an experienced Data Engineer to support Department of Veterans Affairs (VA) programs. The ideal candidate brings deep expertise in SQL, Spark, Databricks, Azure cloud analytics, and database administration (DBA) best practices, including performance tuning, indexing strategies, and data integrity management. This role will design, optimize, and maintain scalable data pipelines and databases across on-prem and cloud environments to support analytics, reporting, and operational decision-making across VA programs. The Data Engineer will collaborate with developers, data scientists, and program stakeholders to deliver reliable, high-performance data and analytics products while improving data accessibility, quality, and operational efficiency.
Essential Duties and Responsibilities
- Design, build, deploy, and maintain scalable ETL/ELT pipelines using SQL, Spark, Databricks, and Azure-native tools such as Azure Data Factory, Azure Synapse, and Azure Data Lake
- Implement workflow orchestration, scheduling, and monitoring to ensure reliable, automated data delivery
- Design, develop, and deliver data quality monitoring and observability tools to track data lineage and manage data integrity across data pipelines
- Optimize data ingestion, transformation, and storage for performance, cost, and maintainability
- Administer and optimize relational databases (e.g., SQL Server, Azure SQL, PostgreSQL), perform database performance tuning, indexing, query optimization, and capacity planning
- Implement and maintain database security, access controls, and auditing in alignment with VA security and privacy requirements
- Support backup/restore strategies, disaster recovery planning, and high‑availability configurations
- Monitor database health, troubleshoot issues, and ensure data integrity across environments. Support troubleshooting, root-cause analysis, and continuous improvement of data engineering and DBA processes
- Implement automated data quality checks, validation rules, and anomaly detection routines
- Develop monitoring dashboards and alerting mechanisms to proactively identify data issues
- Support data profiling, lineage tracking, and metadata management to improve transparency and trust in data assets
- Collaborate with stakeholders to define quality thresholds and acceptance criteria, translating VA stakeholder needs into scalable, repeatable data engineering and database solutions
Minimum Qualifications
- Bachelor’s degree
- 5 years of experience in data engineering, ETL/ELT development, or database administration
- Strong proficiency in SQL, Python, PySpark, Spark, and Databricks for data ingestion, cleaning, and transformation, including performance tuning, indexing, and query optimization
- Experience administering and optimizing relational databases in production environments
- Experience working across VA platforms and enterprise data assets
- Experience implementing automated data pipelines and data quality monitoring frameworks
- Experience with Azure analytics services (Azure Data Factory, Synapse, Azure Data Lake Storage, Azure SQL)
- Experience with GitHub, GitLab, and GitHub Actions
- Authorized to work in the U.S. indefinitely without sponsorship
- Ability to obtain a Public Trust clearance
Preferred Qualifications
- Experience building data pipelines and architectures to manage ingest, integration, and data product development for large-scale unstructured datasets (PDFs, documents) across multiple source systems
- Experience working with data platforms and processing tools, including Databricks and Spark
- Experience migrating legacy and on-premises data and systems to cloud environments
- Experience developing data products through a medallion architecture
- Knowledge and experience applying data governance and data quality management tools, frameworks, and best practices
- AWS (Amazon Web Services) or Azure cloud certifications