What are the responsibilities and job description for the Software Engineer - Tech Lead - Scalable Data and Analytics Systems position at Alldus?
We’re looking for exceptional software engineers to design and scale next-generation data processing and analytics platforms that power OLTP, OLAP, and large-scale distributed data systems. You’ll architect and optimize pipelines and services that handle billions of records daily — enabling real-time transactions, analytical insights, and AI-driven decision-making.
This team is focused on building highly secure, intelligent data infrastructure at enterprise scale. We analyze massive amounts of structured and unstructured data to improve data integrity, security posture, and overall analytics performance across complex environments. Our systems process data at a global scale, supporting mission-critical workloads for large organizations.
Key Responsibilities
- Transactional & Analytical Systems: Design and implement highly scalable OLTP systems for real-time workloads and OLAP systems for complex analytical queries across massive datasets.
- Distributed Processing: Build, optimize, and maintain large-scale batch and streaming data pipelines using frameworks such as Apache Spark, Flink, Presto/Trino, or Kafka Streams (a minimal streaming sketch follows this list).
- System Performance & Scale: Optimize systems for low-latency queries, high-throughput ingestion, and interactive analytics, ensuring seamless performance as data volumes grow to petabyte scale.
- Data Infrastructure: Develop and integrate with modern storage and compute platforms (e.g., Snowflake, BigQuery, Redshift, Cassandra, HDFS, Delta Lake, Iceberg) to support hybrid analytical and transactional workloads.
- Reliability & Observability: Ensure high availability, reliability, and robust monitoring across distributed compute and storage clusters with automated failover and recovery.
- Collaboration: Work closely with data scientists, ML engineers, and product teams to build unified, secure, and cost-efficient data platforms.
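To make the streaming-pipeline responsibility above concrete, here is a minimal sketch of a windowed aggregation in PySpark Structured Streaming, one of the frameworks the role names. The broker address, the `events` topic, and the event schema are illustrative assumptions rather than anything specified in the posting, and a real deployment would include the Kafka connector package and a durable sink instead of the console.

```python
# A minimal sketch, assuming PySpark with the Kafka connector on the classpath.
# Broker, topic, and schema below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, window
from pyspark.sql.types import StructType, StructField, StringType, TimestampType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical event schema, for illustration only.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the raw Kafka stream and parse the JSON payload.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
    .option("subscribe", "events")                         # hypothetical topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Per-user totals over 1-minute tumbling windows, tolerating 30 seconds
# of late-arriving data via a watermark.
totals = (
    events.withWatermark("event_time", "30 seconds")
    .groupBy(window(col("event_time"), "1 minute"), col("user_id"))
    .sum("amount")
)

# Stream results to the console; a production sink (e.g., a lakehouse table
# or another Kafka topic) would go here.
query = totals.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```

The watermark bounds how long aggregation state is retained for late data, which is the usual lever for trading result completeness against memory as ingestion throughput grows.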
Required Skills & Experience
- Strong proficiency in Java, Scala, Python, or Go, with demonstrated experience building distributed back-end systems.
- Deep understanding of database internals, query optimization, indexing, and ACID vs. eventual consistency trade-offs (see the indexing sketch after this list).
- Hands-on experience with big data frameworks (e.g., Spark, Flink, Kafka) and distributed SQL engines (e.g., Presto, Trino, Hive, Impala).
- Expertise in designing OLAP/OLTP architectures for scale and high concurrency.
- Solid foundation in distributed systems, concurrency, parallelism, and caching techniques.
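As a concrete illustration of the indexing and query-optimization skills listed above, the following self-contained sketch uses SQLite, chosen only because it ships with Python and stands in for any OLTP engine. It shows the same query's plan switching from a full table scan to an index lookup; the table and column names are hypothetical.

```python
# A minimal sketch of the indexing trade-off, using Python's built-in sqlite3.
# The "orders" table and its columns are illustrative, not from the posting.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (user_id, amount) VALUES (?, ?)",
    [(f"user{i % 1000}", float(i)) for i in range(100_000)],
)

query = "SELECT SUM(amount) FROM orders WHERE user_id = 'user42'"

# Without an index, the planner reports a full table scan.
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)

# An index trades extra storage and write amplification for fast point lookups.
conn.execute("CREATE INDEX idx_orders_user ON orders (user_id)")
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)
```

The same reasoning scales up: in a distributed OLTP system the index choice also interacts with partitioning and consistency guarantees, which is where the ACID vs. eventual consistency trade-off becomes a design decision rather than a default.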
Preferred Qualifications
- Experience with HTAP (Hybrid Transactional/Analytical Processing) systems or real-time analytics platforms.
- Familiarity with data lakehouse architectures and formats such as Parquet, ORC, Delta, Iceberg, and Hudi (a minimal Parquet sketch follows this list).
- Knowledge of containerized deployments (Docker, Kubernetes) and cloud-native data architectures (AWS, GCP, Azure).
- Background in query engine development or contributions to open-source OLAP/OLTP frameworks.
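For the lakehouse formats listed above, the sketch below uses pyarrow to write and read plain Parquet, the columnar file format underneath Delta, Iceberg, and Hudi. The file path, columns, and values are illustrative assumptions; the point is that column pruning and predicate pushdown at read time are what make these formats suitable for interactive analytics over large datasets.

```python
# A minimal sketch of columnar reads with pyarrow; path and columns are hypothetical.
import pyarrow as pa
import pyarrow.parquet as pq

# Write a small table to Parquet, the on-disk format beneath Delta/Iceberg/Hudi.
table = pa.table({
    "user_id": ["a", "b", "a", "c"],
    "amount": [10.0, 5.0, 7.5, 2.0],
})
pq.write_table(table, "events.parquet")

# Read back only the needed column, with the filter pushed down to the file scan,
# so unneeded columns and row groups are never materialized.
subset = pq.read_table(
    "events.parquet",
    columns=["amount"],
    filters=[("user_id", "=", "a")],
)
print(subset)
```

Table formats like Delta, Iceberg, and Hudi layer transactions, schema evolution, and snapshot isolation on top of exactly this kind of Parquet file layout.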
What Success Looks Like
- OLAP systems that deliver sub-second interactive query performance across petabyte-scale datasets.
- Distributed data pipelines that power real-time and batch processing for AI/ML and large-scale analytics workloads.
Salary: $190,000 - $250,000