What are the responsibilities and job description for the Data Reliability Engineer position at ExaTech Inc?
Data Reliability Engineer
Location: Basking Ridge, NJ
Full-time/Hybrid
About the Role:
As a Data Reliability Engineer II, you will play a crucial role in developing, optimizing, and managing several large data lakes and data warehouses, comprising data from multiple disparate sources.
Responsibilities:
- Proactively monitor PostgreSQL RDS instances for performance, availability, and resource utilization (CPU, memory, storage, connections) using established monitoring tools (e.g., CloudWatch, Prometheus).
- Assist in identifying performance bottlenecks in PostgreSQL RDS. Apply basic performance tuning techniques like reviewing query execution plans, adding missing indexes, and recommending parameter adjustments.
- Monitor the health and performance of Debezium and Kafka Connect connectors, identifying and troubleshooting basic issues related to data capture and delivery.
- Monitor ETL workflows and data pipelines for errors, performance bottlenecks, and processing delays. Troubleshoot and resolve issues to ensure reliable and timely data movement.
- Provide support for data-related issues and participate in root cause analysis.
- Monitor the execution of Apache Airflow DAGs, identify failed tasks, troubleshoot them, and perform re-runs.
- Develop and maintain automation scripts and infrastructure as code (IaC) templates (e.g., using Crossplane, Terraform) to automate routine database tasks, deployments, and updates.
- Participate in on-call rotations to respond to database-related incidents and perform troubleshooting and root cause analysis.
- Assist in implementing and maintaining security best practices for cloud databases, including access controls, encryption, and compliance with regulatory requirements.
- Regularly audit and assess database security configurations.
- Configure and manage database backup and recovery strategies to ensure data integrity and availability in case of failures or data loss.
- Analyze database query performance and collaborate with developers to optimize SQL queries and schemas.
- Participate in continuous improvement initiatives to enhance the reliability, scalability, and performance of cloud databases.
- Assist in the design and optimization of database schemas for cloud environments.
Skills:
- Familiarity with data pipeline concepts and technologies such as Debezium, Kafka Connect, and ETL frameworks.
- Basic understanding of Amazon Redshift and S3.
- Exposure to Apache Spark for data processing.
- Basic understanding of Apache Airflow for workflow orchestration.
- Strong SQL scripting skills for querying and basic data manipulation.
- Familiarity with scripting languages (e.g., Python, Bash) is a plus.
- Knowledge of database security best practices, including access controls, encryption, and compliance with regulatory requirements (e.g., GDPR, HIPAA).
- The 'AWS Certified Database - Specialty' certification is a plus.
Experience and Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 5-7 years of experience in database administration, with a focus on PostgreSQL.
- 2-4 years of hands-on experience with PostgreSQL RDS.