What are the responsibilities and job description for the Senior AWS PySpark Developer position at Silicontek Inc?
Role – Senior AWS PySpark Developer
Location – Hybrid – South San Francisco, CA
We are seeking an experienced Sr. AWS PySpark Developer with 8-10 years of experience to design, build, and optimize our data pipelines and analytics architecture. The ideal candidate will have a strong background in data wrangling and analysis, with a deep understanding of AWS data services.
Key Responsibilities
- Design, build, and optimize robust data pipelines and data architecture on the AWS cloud platform (see the PySpark sketch after this list).
- Wrangle, explore, and analyze large datasets to identify trends, answer business questions, and pinpoint areas for improvement.
- Develop and maintain a next-generation analytics environment, providing a self-service, centralized platform for all data-centric activities.
- Formulate and implement distributed algorithms for effective data processing and trend identification.
- Configure and manage Identity and Access Management (IAM) on the AWS platform.
- Collaborate with stakeholders to understand data requirements and deliver effective solutions.
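To give a concrete flavor of the pipeline work above, here is a minimal PySpark sketch of a batch job that ingests raw events from S3, cleans them, and publishes a daily per-user rollup. It is an illustration only: the bucket paths and column names (user_id, event_date, amount) are hypothetical placeholders, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Minimal sketch of an S3-to-S3 batch pipeline; all paths and
# column names (user_id, event_date, amount) are hypothetical.
spark = SparkSession.builder.appName("events-daily-rollup").getOrCreate()

# Ingest raw event data from S3 (Parquet assumed).
events = spark.read.parquet("s3://example-raw-bucket/events/")

# Wrangle: drop malformed rows and normalize the event date.
clean = (
    events
    .dropna(subset=["user_id", "event_date"])
    .withColumn("event_date", F.to_date("event_date"))
)

# Analyze: aggregate daily activity per user to surface trends.
daily = (
    clean
    .groupBy("event_date", "user_id")
    .agg(
        F.count("*").alias("event_count"),
        F.sum("amount").alias("total_amount"),
    )
)

# Publish: write partitioned Parquet back to the analytics zone.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-analytics-bucket/daily_user_activity/"
)

spark.stop()
```

A job along these lines would typically run on Amazon EMR or as an AWS Glue job, with the output zone exposed to analysts through Athena or Redshift.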
Required Skills & Qualifications
- 8-10 years of experience as a Data Engineer or Developer.
- Proven experience building and optimizing data pipelines on AWS.
- Proficiency in scripting with Python.
- Strong working knowledge of:
  - Big Data Tools: AWS Athena.
  - Relational Databases: AWS Redshift and PostgreSQL.
  - Data Pipeline Tools: AWS Glue, AWS Data Pipeline, or AWS Lake Formation (see the Glue job skeleton after this list).
  - Container Orchestration: Kubernetes, Docker, Amazon ECR/ECS/EKS.
- Experience with wrangling, exploring, and analyzing data.
- Strong organizational and problem-solving skills.
- Experience with machine learning tools (SageMaker, TensorFlow).
- Working knowledge of stream processing (Kinesis, Spark Streaming).
- Experience with analytics and visualization tools (Tableau, Power BI).
- Knowledge of optimizing AWS Redshift performance.
- Bachelor’s or Master’s degree in Information Technology, Computer Science, or a related field.
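For candidates wondering what the AWS Glue scripting above looks like in practice, the skeleton below is the standard Glue PySpark job scaffold with one illustrative transform; the catalog database, table name, and S3 output path are hypothetical placeholders.

```python
import sys
from pyspark.context import SparkContext
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job

# Standard Glue job bootstrap; JOB_NAME is supplied by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table registered in the Glue Data Catalog (names are hypothetical).
source = glue_context.create_dynamic_frame.from_catalog(
    database="example_db", table_name="example_events"
)

# Convert the DynamicFrame to a Spark DataFrame for ordinary
# PySpark transformations.
df = source.toDF().dropDuplicates()

# Write the curated result back to S3 as Parquet.
df.write.mode("overwrite").parquet("s3://example-curated-bucket/example_events/")

job.commit()
```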