What are the responsibilities and job description for the Data Engineer (Spark / Google Cloud Platform – Data Migration) position at Euclid Innovations?
The role focuses on data migration and pipeline development in a hybrid environment (on-premises systems alongside a Google Cloud Platform private cloud), with support for ML workloads.
Required Skills
- Strong Spark with Python (PySpark) skills (mandatory)
- Experience with data migration (on-prem → cloud)
- Hands-on with:
  - Data pipelines (batch/streaming)
  - Data transformation and processing
- Google Cloud Platform experience:
  - BigQuery
  - Cloud Storage
- Experience with schema handling, data quality, and validation
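The schema handling and data-quality skill above typically means validating incoming records against an expected schema before loading them. A minimal, Spark-free Python sketch of the idea (the column names and rules here are illustrative assumptions, not part of the posting; production pipelines would usually use Spark schemas or a dedicated validation tool):

```python
# Simplified schema/data-quality validation sketch (illustrative only).
from typing import Any

# Hypothetical expected schema: column name -> required Python type.
EXPECTED_SCHEMA = {"customer_id": int, "email": str, "balance": float}

def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of data-quality errors for one record (empty list = valid)."""
    errors = []
    for column, expected_type in EXPECTED_SCHEMA.items():
        if column not in record:
            errors.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            errors.append(
                f"wrong type for {column}: {type(record[column]).__name__}"
            )
    return errors

# Extra columns are tolerated here; stricter pipelines might reject them.
```

The same check scales out in Spark by expressing it as column filters or a declared `StructType` schema on read.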
Nice to Have
- Experience in hybrid environments (on-premises plus Google Cloud Platform private cloud)
- Dataplex or data governance tools
- Feature engineering / ML data support
- Experience working with ML/AI teams
Key Responsibilities
- Migrate data from on-prem systems (HDFS, NFS, SQL Server, etc.) to Google Cloud Platform
- Build and optimize data pipelines using Spark and Python
- Ensure data availability for ML training and inference workflows
- Maintain data quality, consistency, and schema compatibility
- Collaborate with ML Engineers and MLOps teams