What are the responsibilities and job description for the AIOps Engineer position at Career Listings?
Benefits:
- 401(k)
- 401(k) matching
- Competitive salary
- Dental insurance
- Health insurance
- Paid time off
- Profit sharing
- Training & development
- Tuition assistance
- Vision insurance
Primary Responsibilities:
- Cross-Functional Leadership: Lead the AIOps platform initiative by acting as the primary technical liaison to existing Network Engineering, ServiceNow, and SolarWinds administration teams to establish unified telemetry pipelines.
- ITSM Orchestration & Automation: Architect closed-loop remediation workflows by deeply integrating Splunk ITSI alerts with ServiceNow Event Management and Incident Management modules.
- Mission-Critical Observability: Architect and maintain Splunk AIOps solutions across unclassified and classified enclaves to provide real-time situational awareness.
- Infrastructure Telemetry Integration: Normalize and correlate network performance and fault data from SolarWinds with server and application logs to provide a holistic view of enterprise health.
- Advanced ML Development: Deploy custom machine learning models via Splunk MLTK to identify anomalous behavior, potential cyber threats, and infrastructure degradations.
- Secure Data Integration: Engineer secure data ingestion pipelines for telemetry data from cross-domain solutions and tactical edge devices.
- Incident Reduction: Utilize IT Service Intelligence (ITSI) to correlate multi-source events, reducing noise and prioritizing high-impact mission alerts.
- Cyber Defense Support: Collaborate with the Cyber Security Service Provider (CSSP) to integrate AIOps insights into defensive cyber operations (DCO).
- Compliance & Documentation: Ensure all observability tools comply with DoW STIGs and IL5/IL6 protocols; develop and maintain architectural documentation and compliance traceability.
- Mission Alignment: Stay current on AIOps and related capabilities relevant to DoD, federal, and intelligence mission systems.
Required Qualifications:
- Security Clearance: Active Top Secret / Sensitive Compartmented Information (TS/SCI) required at time of hire.
- Certification: Active IAT Level II certification (e.g., Security+ CE, CySA+, GSEC, or SSCP) required.
- Citizenship: United States Citizenship is required.
- Platform Experience: 7 years of experience with Splunk Enterprise, including architectural design, cluster management, and advanced Search Processing Language (SPL).
- AIOps & ITSM: 3 years of experience implementing AIOps workflows, including integration with enterprise ITSM solutions (ServiceNow) for automated root cause analysis and remediation.
- Machine Learning: Proven track record of building, testing, and tuning supervised and unsupervised models within the Splunk MLTK.
- Scripting & Automation: Advanced scripting skills (e.g., Python) for developing custom search commands, building API integrations, and automating remediation tasks.
- Leadership: Experience leading technical working groups and directing the efforts of adjacent infrastructure and development teams.
- Operational Experience: Prior experience working within a DoW/DoD Operations Center (NOC/SOC) or supporting mission-critical systems and networks.
- Communication: Must be able to present designs, plans, and analyses of alternatives to technical leadership boards for approval.
Desired Qualifications:
- Enterprise Aggregation: Experience aggregating and correlating telemetry from diverse tools, specifically SolarWinds, ServiceNow, and VMware vCenter.
- Expert Certification: Splunk Enterprise Certified Architect or Splunk ITSI Certified Admin.
- Cloud Observability: Experience with Cloud Native Computing Foundation (CNCF) observability tools in secure hybrid multi-cloud environments (Azure/AWS).
- RMF/ATO Knowledge: Understanding of the Risk Management Framework (RMF) and the Authorization to Operate (ATO) process for AI/ML workloads.