What are the responsibilities and job description for the Big Data Engineer position at Tential Solutions?
We are seeking a highly skilled and experienced Big Data Engineer to design, develop, and optimize large-scale data processing systems. In this role, you will work closely with cross-functional teams to architect data pipelines, implement data integration solutions, and ensure the performance, scalability, and reliability of big data platforms. The ideal candidate will have deep expertise in distributed systems, cloud platforms, and modern big data technologies such as Hadoop and Spark.
Responsibilities
- Design, develop, and maintain large-scale data processing pipelines using Big Data technologies (e.g., Hadoop, Spark, Python, Scala)
- Implement data ingestion, storage, transformation, and analysis solutions that are scalable, efficient, and reliable
- Stay current with industry trends and emerging Big Data technologies to continuously improve the data architecture
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions
- Optimize and enhance existing data pipelines for performance, scalability, and reliability
- Develop automated testing frameworks and implement continuous testing for data quality assurance
- Conduct unit, integration, and system testing to ensure the robustness and accuracy of data pipelines
- Work with data scientists and analysts to support data-driven decision-making across the organization
- Write and maintain automated unit, integration, and end-to-end tests
- Monitor and troubleshoot data pipelines in production environments to identify and resolve issues
AI Tool Proficiency
- Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)
- Strong software development background with the ability to contribute to technical discussions
- Extensive experience with Scrum, Kanban, and continuous improvement practices
- Experience with big data technologies such as Hadoop, Spark, Hive, and Trino
- Understanding of common challenges such as:
  - Data skew and mitigation strategies
  - Working with massive data volumes (petabyte scale)
  - Troubleshooting job failures related to resource constraints, bad data, and scalability issues
- Ability to provide real-world debugging and mitigation examples
- Prompt engineering: ability to craft effective prompts for AI coding assistants and analysis tools
- AI workflow design: experience leveraging AI to redesign development processes
- Data analysis: ability to interpret AI-generated insights and translate them into actionable improvements
- Change management: experience supporting AI adoption and workflow transformation
- Proficiency in SQL including window functions, multi-table joins, and aggregations
- Ability to write and optimize complex queries
- Experience handling edge cases such as NULLs, duplicates, and ordering
- Strong understanding of Spark architecture (executors, tasks, stages, DAG)
- Experience with performance tuning techniques (partitioning, caching, broadcast joins)
- Ability to troubleshoot slow or failing jobs and resolve resource bottlenecks
- Experience optimizing jobs for large-scale datasets
- Experience with AWS services such as S3, EMR, Glue, Lambda, and Athena
- Experience working with S3 in Spark environments (file formats, consistency challenges, etc.)
- Familiarity with EKS and serverless technologies
- Ability to write clean, modular, and performant code
- Experience with functional programming concepts (immutability, higher-order functions)
- Understanding of collections, concurrency, and memory management
- Experience building scalable data processing systems
- Bachelor’s degree in Computer Science, Information Systems, or related discipline with at least five (5) years of related experience, or equivalent training and/or work experience
- Master’s degree and financial services industry experience preferred
- Demonstrated technical expertise in object-oriented and database technologies leading to enterprise-quality solutions
- Experience developing enterprise solutions in an iterative or Agile environment
- Extensive knowledge of test automation, build automation, and configuration management frameworks
- Strong written and verbal communication skills
- Proven ability to build effective working relationships and improve quality of work products
- Strong organizational skills with the ability to manage competing priorities
- Ability to learn new technologies quickly and work in a fast-paced environment
- Experience with object-oriented programming languages such as Java, Scala, or Python
- Experience managing production data pipelines and ETL systems
- Experience with CI/CD pipelines
- Experience writing test cases
- AWS certifications