Senior Platform DevOps Engineer
AWS | Kubernetes | Terraform | Platform Engineering
Location: Basking Ridge, NJ / Dallas, TX / Tampa, FL (Hybrid/Onsite)
About BrezQ
BrezQ is a technology company specializing in Software Development, Cloud Engineering, AI Solutions, Enterprise Platforms, Data Engineering, and Digital Transformation. We help organizations build, modernize, and scale mission-critical technology ecosystems.
Position Overview
We are seeking a Senior Platform DevOps Engineer to build and operate highly scalable, secure, automated cloud platforms that support enterprise applications and digital services.
This role focuses on platform engineering, cloud infrastructure automation, Kubernetes operations, CI/CD enablement, observability, and production reliability.
The ideal candidate has extensive experience running production workloads in AWS, building Infrastructure as Code (IaC) solutions, managing Kubernetes clusters, and driving operational excellence across enterprise environments.
This is not a build-and-release-only DevOps role. We are looking for engineers who understand infrastructure, automation, security, reliability, and platform scalability.
What You Will Own
- AWS cloud infrastructure.
- Kubernetes platform operations.
- Infrastructure as Code (IaC).
- CI/CD platform engineering.
- Production reliability and uptime.
- Cloud security and governance.
- Observability and monitoring platforms.
- Platform automation initiatives.
- Incident response and root cause analysis.
- Developer enablement and self-service platforms.
Key Responsibilities
Cloud Platform Engineering
- Design and manage AWS infrastructure.
- Build highly available cloud environments.
- Implement scalable multi-environment deployment strategies.
- Optimize cloud performance and cost efficiency.
- Define infrastructure standards and best practices.
Kubernetes Engineering
- Build and maintain Kubernetes platforms.
- Manage cluster lifecycle and upgrades.
- Optimize workload performance and resource utilization.
- Implement autoscaling strategies.
- Design resilient containerized environments.
Infrastructure as Code
- Build and maintain Terraform modules.
- Automate infrastructure provisioning.
- Standardize cloud deployments.
- Maintain reusable infrastructure patterns.
CI/CD Engineering
- Design and maintain deployment pipelines.
- Improve deployment automation.
- Implement release management processes.
- Reduce deployment risk and improve delivery speed.
Reliability Engineering
- Establish SLIs, SLOs, and operational standards.
- Improve platform availability and resilience.
- Perform incident management and root cause analysis.
- Develop recovery and failover procedures.
Security & Compliance
- Implement cloud security best practices.
- Manage IAM policies and access controls.
- Integrate security scanning into CI/CD pipelines.
- Support compliance and audit requirements.
Required Technical Skills
Cloud Platforms
- AWS
- EC2
- EKS
- ECS
- VPC
- Route 53
- S3
- RDS
- IAM
- CloudWatch
Kubernetes
- Kubernetes Administration
- Helm
- Ingress Controllers
- Service Mesh Concepts
- Cluster Operations
- Autoscaling
- Container Orchestration
Infrastructure as Code
- Terraform
- Infrastructure Automation
- Environment Provisioning
Containers
- Docker
- Container Security
- Image Management
CI/CD
- Jenkins
- GitHub Actions
- GitLab CI
- Continuous Integration
- Continuous Delivery
Monitoring & Observability
- Prometheus
- Grafana
- Datadog
- Splunk
- ELK Stack
Scripting & Automation
- Python
- Bash
- Shell Scripting
Networking
- DNS
- Load Balancers
- TCP/IP
- SSL/TLS
- Networking Fundamentals
Required Experience
- At least 7 years of DevOps, Platform Engineering, or Cloud Engineering experience.
- At least 4 years of AWS production experience.
- At least 3 years of Kubernetes production experience.
- Strong Terraform experience.
- Experience supporting enterprise-scale environments.
- Experience building CI/CD platforms.
- Experience implementing monitoring and observability solutions.
- Experience supporting 24x7 production systems.
Preferred Experience
- Telecommunications industry experience.
- Experience operating large-scale Kubernetes environments.
- AWS Certifications.
- Kubernetes Certifications.
- Experience supporting highly regulated enterprise environments.