What are the responsibilities and job description for the SRE Support Engineer position at NLB Services?
Primary Skills
- Python, PowerShell, C# / .NET, KQL
- Azure Monitor, App Insights, Logs, ADX
- Kubernetes, containers, CI/CD
- Azure architecture & reliability design
- Incident response automation tooling
Core Competencies
- Strong debugging mindset.
- Ability to automate.
- Deep telemetry analysis.
- Reliability-first engineering approach.
Job Description
Required Skill
Work description
Level of experience
Python / C# / .NET
Used by SREs to build automation, diagnostics, and reliability features; these languages align with the Microsoft stack and Ads platform tooling.
Basic / Intermediate (full-stack development experience not needed)
Scripting (Python, PowerShell, Bash)
Essential for reducing toil, writing auto-mitigation scripts, and managing environments.
Basic / Intermediate
SRE / Data Engineering Ops
LiveSite handling, on‑call fundamentals, incident response workflows.
Root cause analysis, post‑incident reviews, and preventive action planning.
Familiarity with API failures, latency analysis, and service degradation patterns.
Mandatory
Cloud Architecture & Reliability Engineering
Azure Compute (VMs, AKS, Functions), networking, and autoscaling.
Capacity planning, load balancing, scaling architecture.
Resilient design patterns: retries, circuit breakers, throttling, graceful degradation.
Microservices & containerization (Kubernetes, Docker).
Mandatory: PowerShell / Python, GitHub Actions, Azure DevOps, Bicep, AKS, Docker, Log Analytics, KQL dashboards, Azure Pipelines, Git/GitHub
Program management (Preferred)
Own communication with Data, PM, Support, and Leadership teams.
Ensure consistency in triage, incident workflow, and RCA processes.
Analyze customer pain points, classify themes, and drive improvements with Engineering.