What are the responsibilities and job description for the Specialist - Data Engineering position at LTM?
Role description
Job Title: PySpark Developer
Work Location: Irving, Texas
Job Summary
We are seeking a highly skilled and motivated Data Engineer to play a pivotal role in designing, building, and optimizing our next-generation scalable data pipelines. This position requires expertise in processing massive datasets using technologies such as Apache Spark, PySpark, and Hive within a dynamic cloud environment. Your primary objective will be to ensure data reliability, speed, and efficiency, providing a robust foundation for downstream business intelligence and advanced analytics initiatives.
Key Responsibilities
- Data Pipeline Development and Maintenance: Design, build, and maintain highly scalable and efficient ETL/ELT data pipelines using PySpark and Spark SQL for complex data transformations.
- Cloud Data Infrastructure Management: Deploy, manage, and scale critical data infrastructure components on leading cloud platforms such as Amazon Web Services (AWS; e.g., EMR, Glue), Microsoft Azure (e.g., Databricks, Synapse), or Google Cloud Platform (GCP).
- Data Warehousing and Storage Optimization: Manage data layout, partitioning, and indexing within Apache Hive and various cloud data lake solutions to optimize performance and accessibility.
- Performance Tuning and Optimization: Proactively identify and resolve performance bottlenecks in Spark jobs, using the Spark UI for in-depth analysis, managing data skew, and optimizing memory utilization.
- Diverse Data Integration: Develop robust solutions for ingesting high-volume, diverse datasets from both structured sources (relational databases) and unstructured sources (flat files) into our data ecosystem.
- Automated Workflow Orchestration: Implement and manage automated data workflows using industry-standard scheduling tools such as Apache Airflow or platform-native schedulers, ensuring timely and reliable data delivery.
- Strategic Collaboration: Partner closely with data scientists, business analysts, and cross-functional enterprise teams to translate complex business requirements into technically sound and efficient data solutions.
Required Core Technical Skills
- Big Data Frameworks Expertise: Demonstrated high proficiency in Apache Spark architecture, including a deep understanding of drivers, executors, and Directed Acyclic Graphs (DAGs).
- Advanced Programming: Exceptional coding skills in Python and extensive experience with the PySpark API for developing intricate data transformations and processing logic.
- Querying and Schema Management: Strong command of HiveQL and ANSI SQL, coupled with expertise in data partitioning techniques and effective schema definition.
- Optimized Storage Formats: In-depth understanding of and practical experience with optimized big data file formats such as Parquet, ORC, and Avro.
- Cloud Ecosystem Development: Hands-on development experience using cloud-native big data utilities (e.g., AWS EMR, Azure Databricks) within major cloud platforms.
- Data Warehousing Fundamentals: Solid foundation in Dimensional Data Modeling, including Star and Snowflake schemas, and practical experience with Data Lake concepts and implementation.
Preferred Qualifications
- CI/CD and DevOps Automation: Experience with Continuous Integration/Continuous Deployment (CI/CD) practices and automation tools such as Git, Jenkins, or Ansible.
- NoSQL Database Integration: Exposure to and experience with NoSQL databases such as HBase, Cassandra, or MongoDB.
- Professional Cloud Certifications: Relevant professional cloud certifications (e.g., AWS Certified Data Engineer, Microsoft Certified: Azure Data Engineer Associate) are highly valued.
Actual compensation within the range will be dependent upon the individual's skills, experience, performance and internal equity.
Benefits/perks listed below may vary depending on the nature of your employment with LTIMindtree (“LTIM”):
Benefits and Perks:
- Comprehensive Medical Plan Covering Medical, Dental, Vision
- Short Term and Long-Term Disability Coverage
- 401(k) Plan with Company match
- Life Insurance
- Vacation Time, Sick Leave, Paid Holidays
- Paid Paternity and Maternity Leave
The range displayed on each job posting reflects the minimum and maximum salary target for the position across all US locations. Within the range, individual pay is determined by work location and job level and additional factors including job-related skills, experience, and relevant education or training. Depending on the position offered, other forms of compensation may be provided as part of overall compensation like an annual performance-based bonus, sales incentive pay and other forms of bonus or variable compensation.
Disclaimer: The compensation and benefits information provided herein is accurate as of the date of this posting.
LTIMindtree is an equal opportunity employer that is committed to diversity in the workplace. Our employment decisions are made without regard to race, color, creed, religion, sex (including pregnancy, childbirth or related medical conditions), gender identity or expression, national origin, ancestry, age, family-care status, veteran status, marital status, civil union status, domestic partnership status, military service, handicap or disability or history of handicap or disability, genetic information, atypical hereditary cellular or blood trait, union affiliation, affectional or sexual orientation or preference, or any other characteristic protected by applicable federal, state, or local law, except where such considerations are bona fide occupational qualifications permitted by law.
Salary: $83,912 - $128,080