What are the responsibilities and job description for the W2 Sr. Cloud Data Integration Engineer IICS CDI / AWS / PySpark / Python(Hybrid) position at Jobs via Dice?
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Cliff Services Inc, is seeking the following. Apply via Dice today!
W2 Sr. Cloud Data Integration Engineer IICS CDI / AWS / PySpark / Python
Location: McLean, VA (Hybrid)
Interview: Video
Experience: 8 Years
About The Role
We are seeking a Senior Cloud Data Integration Engineer with deep hands-on expertise in Informatica Intelligent Cloud Services - Cloud Data Integration (IICS CDI) to join a high-impact data engineering team. The ideal candidate brings strong experience designing, building, and optimizing enterprise-grade data pipelines on cloud platforms, with solid proficiency in AWS, PySpark, and Python. This role is central to enabling scalable, governed data movement and transformation across complex enterprise environments.
IICS CDI refers to Informatica Intelligent Cloud Services - Cloud Data Integration, a cloud-native ETL/ELT and data integration platform used for building and orchestrating data pipelines across cloud and on-premise sources.
Key Responsibilities
- Design, develop, and maintain end-to-end data integration pipelines using IICS CDI (Informatica Intelligent Cloud Services - Cloud Data Integration)
- Build and manage mappings, mapping tasks, taskflows, and orchestration workflows within IICS CDI
- Develop and optimize data ingestion, transformation, and loading pipelines across structured, semi-structured, and unstructured data sources
- Integrate data from diverse sources including relational databases, flat files, APIs, cloud storage, and SaaS applications into target systems
- Build and maintain PySpark-based data transformation pipelines for large-scale distributed data processing
- Write Python scripts for data automation, pipeline orchestration, data validation, and utility development
- Leverage AWS services including S3, Glue, Lambda, Redshift, RDS, EMR, and Step Functions for cloud-based data engineering workflows
- Implement CDC (Change Data Capture) patterns and incremental load strategies within IICS CDI pipelines
- Collaborate with data architects, business analysts, and downstream consumers to understand requirements and translate them into scalable integration solutions
- Monitor, troubleshoot, and optimize pipeline performance, identifying and resolving bottlenecks
- Implement data quality checks, validation rules, and exception handling within integration pipelines
- Maintain technical documentation including data flow diagrams, mapping specifications, and pipeline runbooks
- Participate in code reviews, CI/CD pipeline integration, and deployment across dev, test, and production environments
- Ensure data security, compliance, and governance standards are maintained across all integration workflows
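The CDC / incremental-load pattern named in the responsibilities above can be sketched in a few lines of plain Python (a hedged illustration only, not IICS- or Spark-specific; the `updated_at` field and function name are hypothetical): new records are selected by comparing a timestamp against the last stored watermark, and the watermark then advances to the newest timestamp seen.

```python
from datetime import datetime

def incremental_load(records, watermark):
    """Return records newer than the watermark, plus the advanced watermark.

    records   -- iterable of dicts, each with an 'updated_at' datetime
    watermark -- datetime of the last successful load (exclusive lower bound)
    """
    new_rows = [r for r in records if r["updated_at"] > watermark]
    # Advance the watermark only if new rows arrived; otherwise keep it as-is.
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

# Example: only the row changed after the last load is picked up.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 3)},
]
delta, wm = incremental_load(rows, watermark=datetime(2024, 1, 2))
```

In a production pipeline the same idea would typically be expressed as a filter pushed down to the source query or as an IICS mapping parameter, with the watermark persisted between runs.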
Required Qualifications
- 8 years of overall experience in data integration and data engineering
- 5 years of hands-on experience with Informatica IICS CDI - including mappings, taskflows, connectors, runtime environments, and Secure Agent configuration
- Strong experience with AWS cloud services: S3, Glue, Redshift, Lambda, EMR, Step Functions, IAM, and CloudWatch
- Proficiency in PySpark for distributed data transformation and large-scale data processing
- Solid Python programming skills for scripting, automation, and pipeline support
- Strong SQL skills for data validation, transformation logic, and backend querying
- Experience with CDC patterns, incremental loading, and real-time or near-real-time data integration
- Familiarity with REST API integration and web service-based data extraction within IICS
- Experience with version control tools: Git, Bitbucket, or GitHub
- Strong understanding of data warehousing concepts, dimensional modeling, and ETL/ELT best practices
- Excellent problem-solving, communication, and cross-functional collaboration skills
Nice To Have
- Experience with Informatica MDM, IDMC, or CAI (Cloud Application Integration) in addition to CDI
- Familiarity with Snowflake or Redshift as target data warehouse platforms
- Experience with Apache Airflow or AWS Step Functions for pipeline orchestration
- Knowledge of data governance, lineage tracking, and metadata management
- Exposure to Agile/Scrum delivery methodologies
- Informatica or AWS cloud certifications are a plus
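For candidates unfamiliar with the data-quality checks mentioned in the responsibilities, here is a minimal pure-Python sketch of the idea (column names, thresholds, and the function itself are invented for illustration, not part of the IICS platform): each batch is screened for a minimum row count and for missing required values before it is loaded.

```python
def validate_batch(rows, required_cols, min_rows=1):
    """Run basic data-quality checks on a batch before loading.

    Returns a list of human-readable violations (empty list = batch passes).
    """
    violations = []
    if len(rows) < min_rows:
        violations.append(f"row count {len(rows)} below minimum {min_rows}")
    for i, row in enumerate(rows):
        for col in required_cols:
            # Treat both missing keys and empty values as violations.
            if row.get(col) in (None, ""):
                violations.append(f"row {i}: missing required column '{col}'")
    return violations

batch = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
issues = validate_batch(batch, required_cols=["id", "email"])
```

In practice these rules would be wired into the pipeline's exception-handling path so that failing batches are quarantined rather than loaded.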