What are the responsibilities and job description for the Data Engineer with GCP & Scala - Only W2 position at Saransh Inc?
Role: Data Engineer with GCP & Scala - Only W2
Location: Bentonville, AR (Onsite from Day 1)
Job Type: W2 Contract
Only W2 - No C2C or third-party candidates
Mandatory Areas
Must Have Skills – Data Engineer with Scala
Scala, Spark, Python, SQL, Big Data, Hadoop
GCP data tools: BigQuery, Dataproc, Vertex AI, Pub/Sub, Cloud Functions
PySpark, Python, Spark SQL, and data modeling
Total IT Experience – Minimum 8 Years
GCP - 4 years of recent GCP experience
Description
We are seeking a Data Engineer with Spark and Scala streaming skills who builds real-time, scalable data pipelines using tools like Spark, Kafka, and cloud services (GCP) to ingest, transform, and deliver data for analytics and ML.
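In practice, a pipeline of the shape described above often looks like the following minimal Spark Structured Streaming sketch in Scala: ingest from Kafka, transform the records, and deliver them to a sink. The broker address, topic name, event schema, and sink are illustrative assumptions, not details from this posting.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object RealtimeIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("realtime-ingest").getOrCreate()

    // Assumed shape of incoming events (placeholder schema).
    val schema = new StructType().add("id", StringType).add("ts", LongType)

    // Ingest: subscribe to a Kafka topic (placeholder broker/topic).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()

    // Transform: Kafka values arrive as bytes; cast to string and parse JSON.
    val events = raw
      .select(from_json(col("value").cast("string"), schema).as("e"))
      .select(col("e.id"), col("e.ts"))

    // Deliver: write the stream out (console here; BigQuery/GCS in practice).
    events.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/checkpoints/realtime-ingest")
      .start()
      .awaitTermination()
  }
}
```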
Responsibilities
- As a Senior Data Engineer, you will design, develop, and maintain ETL/ELT data pipelines for batch and real-time data ingestion, transformation, and loading using Spark (PySpark/Scala) and streaming technologies (Kafka, Flink).
- Build and optimize scalable data architectures, including data lakes, data warehouses (BigQuery), and streaming platforms.
- Performance Tuning: Optimize Spark jobs, SQL queries, and data processing workflows for speed, efficiency, and cost-effectiveness (see the tuning sketch after this list).
- Data Quality: Implement data quality checks, monitoring, and alerting systems to ensure data accuracy and consistency (see the data-quality sketch after this list).
- Programming: Strong proficiency in Python, SQL, and potentially Scala/Java.
- Big Data: Expertise in Apache Spark (Spark SQL, DataFrames, Streaming).
- Streaming: Experience with messaging queues like Apache Kafka, or Pub/Sub.
- Cloud: Familiarity with GCP and Azure data services.
- Databases: Knowledge of data warehousing (Snowflake, Redshift) and NoSQL databases.
- Tools: Experience with Airflow, Databricks, Docker, and Kubernetes is a plus.
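As a concrete illustration of the performance-tuning responsibility above, here is a minimal Scala sketch: the join broadcasts a small dimension table to avoid a shuffle, and the output is repartitioned explicitly. Table paths, the partition count, and the join key are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object TuningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("tuning-sketch").getOrCreate()

    val facts = spark.read.parquet("gs://example-bucket/facts") // large table
    val dims  = spark.read.parquet("gs://example-bucket/dims")  // small lookup

    // Broadcast the small side so each executor joins locally (no shuffle).
    val joined = facts.join(broadcast(dims), Seq("dim_id"))

    // Repartition by the write key to reduce small files and skew.
    joined.repartition(200, joined("dim_id"))
      .write
      .mode("overwrite")
      .parquet("gs://example-bucket/joined")
  }
}
```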
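And for the data-quality responsibility, a minimal sketch that counts rows violating simple constraints and fails the job when the violation rate exceeds a threshold, so an orchestrator such as Airflow can alert on it. Column names, the input path, and the 1% threshold are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object QualityCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("quality-check").getOrCreate()

    val df = spark.read.parquet("gs://example-bucket/events")

    val total = df.count()
    // Constraint: id must be non-null and ts must be positive.
    val bad = df.filter(col("id").isNull || col("ts") <= 0).count()

    val rate = if (total == 0) 0.0 else bad.toDouble / total
    println(f"quality check: $bad/$total bad rows (${rate * 100}%.2f%%)")

    // Fail loudly so downstream monitoring/alerting picks it up.
    if (rate > 0.01) sys.error(f"data quality failure: $rate%.4f > 0.01")
  }
}
```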