What are the responsibilities and job description for the Lead Software Engineer - ETL/ELT Pipelines / Python / PySpark position at JPMorgan Chase?
We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible. Data is one of our most significant competitive assets, and within our business it is a crucial enabler for impactful initiatives that enhance efficiency and accelerate business growth.
As a Lead Software Engineer - ETL/ELT Pipelines / Python / PySpark at JPMorgan Chase within the Asset and Wealth Management Technology Team, you will play a crucial role as part of an agile team dedicated to building a client-centric view of all investment data and unifying client data in a secure, stable, and scalable manner. As a core technical contributor, you are responsible for delivering critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.
Job responsibilities
- Lead the development of secure high-quality production code, and review and debug code written by others
- Ensure data quality, integrity, and security across all data systems and platforms and enforce data governance policies and best practices
- Design and implement scalable data solutions that align with business objectives and technology strategies, and carry out technical troubleshooting with the ability to think beyond routine or conventional approaches to build and support solutions or break down technical problems
- Design, develop, and optimize robust ETL/ELT pipelines using SQL, Python, and PySpark for large-scale, complex data environments
- Collaborate with cross-functional teams to understand data requirements and translate them into technical specifications
- Conduct performance tuning and optimization of data systems to ensure high availability and scalability
- Identify opportunities to eliminate or automate remediation of recurring issues to improve overall operational stability of software applications and systems
- Stay current on emerging ETL and data engineering technologies and industry trends to drive innovation
- Work closely with stakeholders to identify opportunities for data-driven improvements and efficiencies
- Maintain detailed documentation for pipelines, data models, and integration processes
Required qualifications, capabilities, and skills
- Formal training or certification in software engineering concepts and 5 years of applied experience
- Proven experience as a lead engineer in data management, ETL/ELT pipeline development, and large-scale data processing, with strong hands-on coding proficiency in Python, PySpark, Apache Spark, SQL, and AWS cloud services such as AWS EMR, S3, Athena, and Redshift
- Strong understanding of data quality, security, and lineage best practices
- Hands-on experience with AWS cloud and data lake platforms, Snowflake, Databricks, etc.
- Experience with cloud-based data warehouse migration and modernization
- Intimate knowledge of, and ability to implement, unit, integration, and functional testing strategies
- Experience providing the tools that enable data to be made available on Mesh and distributed to meet consumer needs
- Proficiency in automation and continuous delivery methods, and an understanding of agile practices such as CI/CD, application resiliency, and security
- Excellent problem-solving skills, with the ability to optimize performance and troubleshoot complex data pipelines
- Strong communication and documentation abilities
- Ability to collaborate effectively with business and technical stakeholders
Preferred qualifications and skills
- Knowledge of Apache Iceberg
- In-depth knowledge of the financial services industry and IT systems