What are the responsibilities and job description for the Systems Operations Engineer position at Jobs via Dice?
Location: Charlotte, NC
Salary: $61.00 - $66.00 USD per hour
Description:
Job Title: Senior Site Reliability Engineer (Systems Operations Engineer)
Location: Charlotte, NC or Irving, TX
Schedule: Hybrid - 3 days per week onsite (mandatory)
Contract: 18 months (with possible extension and eligibility for conversion)
About The Role
We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to support key Shared Services Operations Technology platforms, including Payment Evaluations, Regulatory Operations, Financial Crimes, and Business & Real Estate Evaluation. You will be part of a team responsible for maintaining availability, performance, and reliability across ~85 applications that support KYC, AML, and other critical financial-crimes-related workloads.
This role blends software engineering, systems operations, and cloud-native reliability practices to drive automation, enhance resilience, and support modernization across a large enterprise ecosystem. You will also help evolve AIOps capabilities, including predictive alerting, self-healing workflows, and AI/ML-driven incident analysis.
Occasional weekend work or overtime may be required for critical system support.
What You'll Do
Site Reliability & Operations
- Lead SRE practices that enhance system availability, performance, and scalability across multi-cloud environments.
- Support and improve critical applications and customer journeys; lead incident response and blameless postmortems.
- Conduct root-cause analysis and drive long-term remediation of recurrent issues.
- Define and enforce operational readiness and Non-Functional Requirements (NFRs) during platform modernization.
- Design and implement automation to eliminate operational toil and improve service reliability.
- Build frameworks for automated SLO/SLI tracking, availability metrics, error budgeting, and customer impact analysis.
- Implement self-healing and autonomic systems using AI/ML, RPA, and intelligent monitoring.
- Develop and enhance monitoring, alerting, and observability capabilities.
- Drive adoption of AIOps platforms to support anomaly detection, predictive alerting, and automated incident resolution.
- Collaborate with platform teams, product owners, and technology partners across the COO Technology organization.
- Mentor peers and champion SRE best practices across engineering teams.
- Identify process gaps across domains and recommend scalable, long-term improvements.
Qualifications
- 5 years in Systems Engineering, Site Reliability Engineering, Technology Architecture, or related fields (or equivalent military/training/education experience).
- 2 years working as part of an SRE team.
- Strong written and verbal communication skills.
Software Development
- Proficiency in Python and/or Java/J2EE.
- Experience with REST APIs, microservices, Kafka/MQ, and modern integration patterns.
- Familiarity with JavaScript frameworks (React, Bootstrap).
- Strong SQL skills and database schema design experience.
- Expertise with Linux and container orchestration (Kubernetes, OpenShift/OCP strongly preferred).
- Experience with PCF, AWS, Google Cloud Platform, or Azure environments.
- CI/CD and automation tools: Jenkins, GitLab, SonarQube, Artifactory, Ansible.
- Monitoring and observability tools: Grafana, Prometheus, Splunk/ELK, AppDynamics, Elastic, ThousandEyes, Aternity, Google Cloud Logging.
- AIOps Platforms: Moogsoft, AI/ML-based analytics frameworks.
- ITSM Tools: ServiceNow, Remedy, IBM Netcool.
- Databases: Oracle, DB2, SQL Server, MongoDB, Hadoop/Cloudera, Spark, Teradata.
- Understanding of common AI/ML concepts (classification, regression, clustering, anomaly detection).
- Ability to work with structured/unstructured data for model evaluation.
- Awareness of ethical/operational considerations in AI systems.
- Experience integrating AI into automation workflows is a plus.
- Experience with AutoSys.
- Prior experience in corporate banking or financial services.
- Strong interest in AI-driven operations and AIOps.
Contact:
This job and many more are available through The Judge Group. Please apply with us today!