What are the responsibilities and job description for the Big Data Developer (Scala & Java) position at RIIM?
Big Data Developer (Scala & Java) – Job Description
We are looking for an experienced Big Data Developer with strong Scala and Java expertise to build scalable data processing systems and high-performance data pipelines. The ideal candidate has hands-on experience with distributed computing, real-time data streaming, and cloud-based big data technologies.
Responsibilities
- Design, develop, and maintain large-scale big data applications using Scala and Java.
- Build scalable ETL/ELT pipelines for processing structured and unstructured data.
- Develop distributed data processing solutions using Apache Spark, Hadoop, and Kafka.
- Implement real-time streaming applications and event-driven architectures.
- Optimize big data applications for performance, scalability, and reliability.
- Work closely with data engineers, architects, analysts, and business teams to deliver data solutions.
- Develop RESTful APIs and microservices for data integration and processing.
- Perform data validation, cleansing, transformation, and aggregation activities.
- Monitor and troubleshoot production data pipelines and resolve performance bottlenecks.
- Participate in code reviews, design discussions, and Agile development activities.
Required Skills
- Strong programming experience in Scala and Java.
- Hands-on expertise with Apache Spark (Core, SQL, Streaming) and the Hadoop ecosystem.
- Experience with Kafka or other messaging/streaming platforms.
- Knowledge of distributed systems and parallel data processing concepts.
- Strong SQL skills and experience with relational and NoSQL databases.
- Experience building ETL and data pipeline solutions.
- Familiarity with Hive, HBase, Cassandra, MongoDB, or Snowflake.
- Experience with cloud platforms such as AWS, Azure, or GCP.
- Hands-on experience with Docker, Kubernetes, Jenkins, Git, and CI/CD pipelines.
- Understanding of data modeling, data warehousing, and big data architecture.
Preferred Qualifications
- Experience with Databricks, Delta Lake, or cloud-native data platforms.
- Knowledge of Airflow, Oozie, or workflow orchestration tools.
- Experience in real-time analytics and streaming data architectures.
- Exposure to machine learning pipelines or AI/ML data processing.
- Experience working in Agile/Scrum environments.
- Bachelor’s degree in Computer Science, Engineering, or a related field.
Nice to Have
- Experience with Python for data engineering tasks.
- Knowledge of Terraform or Infrastructure as Code.
- Cloud certifications in AWS, Azure, or GCP.
- Experience in finance, healthcare, telecom, or e-commerce domains.