What are the responsibilities and job description for the Automation Systems Engineer / Automation Platform Engineer - Remote position at Ohm Systems, Inc?
Job Description
High-Level Requirements – Contracted Role to Build Automation Systems
Role Title (suggested): Automation Systems Engineer / Automation Platform Engineer (Contract)
Role Overview: Design, build, and operationalize automation solutions that reduce manual operational toil, improve reliability, and standardize repeatable run activities. The contractor will deliver production-ready automations with appropriate controls, logging, and documentation, and will partner with Service Delivery and SRE stakeholders to prioritize, implement, and measure outcomes.
Objectives & Scope
- Identify and prioritize automation candidates (manual tasks, recurring incidents, high-MTTR activities/KPIs, and others as assigned) in partnership with operations/SRE.
- Deliver automations that are secure, auditable, and supportable (monitoring, alerting, runbooks, and handoff).
- Standardize patterns and reusable components so solutions scale across domains/applications.
- Quantify impact (hours saved, number of times run, defect reduction, MTTR improvement) and feed these to a dashboard for reporting.
- Identify the run frequency that would indicate an underlying problem to address, and set up notifications for this condition.
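As an illustration of the run-frequency objective above, the following is a minimal sketch, not a prescribed design. It assumes a sliding time window and a fixed threshold agreed with operations/SRE; the class name `RunFrequencyMonitor` and its parameters are hypothetical, and the actual notification channel is left to the caller.

```python
from collections import deque
from datetime import datetime, timedelta


class RunFrequencyMonitor:
    """Flags an automation that runs more often than an agreed threshold.

    Hypothetical helper: max_runs and window are assumptions to be set
    per automation with operations/SRE.
    """

    def __init__(self, max_runs: int, window: timedelta):
        self.max_runs = max_runs
        self.window = window
        self._runs = deque()  # timestamps of recent runs, oldest first

    def record_run(self, when: datetime) -> bool:
        """Record one run; return True if the threshold is now exceeded."""
        self._runs.append(when)
        # Drop runs that have aged out of the sliding window.
        while self._runs and when - self._runs[0] > self.window:
            self._runs.popleft()
        return len(self._runs) > self.max_runs
```

A caller would send a notification whenever `record_run` returns `True`, since a remediation that fires that often likely masks an underlying problem.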
Key Responsibilities
- Automation design & build: Create workflows/scripts/services to automate operational processes (e.g., ticket triage steps, data validations, batch reruns, retries, reconciliations, reporting).
- Platform integration: Integrate with enterprise tooling (ServiceNow, monitoring/observability, schedulers, CI/CD) as needed to run automations safely in production.
- Reliability controls: Implement guardrails (idempotency, retries/backoff, rate limits, circuit breakers), robust error handling, and recovery paths.
- Security & compliance: Follow least-privilege access, secrets management, logging standards, and audit requirements for systems and data.
- Testing & quality: Build unit/integration tests, perform non-prod validation, and define acceptance criteria aligned to operational readiness.
- Observability: Add structured logs, metrics, and health checks; define alerts/dashboards to detect failures and measure throughput.
- Documentation & handoff: Produce runbooks, support guides, and knowledge transfer for Tier 2/3 ownership.
- Stakeholder partnership: Participate in intake grooming, technical design reviews, and demos; provide weekly status and risks/issues.
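To make the reliability controls above concrete, here is a rough sketch of two of the named guardrails, retries with exponential backoff and idempotency. All names (`run_with_retries`, `IdempotentRunner`) and default values are illustrative assumptions, and the in-memory result store stands in for whatever durable store a production automation would use.

```python
import random
import time


def run_with_retries(action, max_attempts=4, base_delay=0.5, max_delay=8.0,
                     sleep=time.sleep):
    """Call action(); on exception, retry with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from concurrent automations.
            sleep(delay + random.uniform(0, delay / 2))


class IdempotentRunner:
    """Runs an action at most once per key (assumption: in-memory only)."""

    def __init__(self):
        self._results = {}

    def run_once(self, key, action, **retry_options):
        # A repeated request with the same key returns the stored result
        # instead of repeating the side effect.
        if key not in self._results:
            self._results[key] = run_with_retries(action, **retry_options)
        return self._results[key]
```

The key would typically be something stable such as a ticket number or batch ID, so that a rerun triggered twice for the same incident performs the work only once.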
Expected Deliverables
- Automation backlog & intake: A ranked inventory of automation candidates with effort/ROI estimates and success measures.
- Reusable automation framework: Templates/libraries for common patterns (config, logging, retries, auth, scheduling, notifications).
- Production-ready automations: Delivered in increments, each with code, tests, deployment artifacts, and operational documentation.
- Operational readiness package: Runbook, monitoring/alerting, rollback plan, and ownership handoff checklist for each automation.
- Metrics & reporting: A simple dashboard or report showing delivery progress and realized benefits (hours saved per month/year to date, number of times run, MTTR reduction, error rate, throughput). These metrics should be available per automation and for all automations.
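The per-automation and overall metrics described above could be derived from simple run records, as in the sketch below. The `RunRecord` fields, the `summarize` function, and the reserved `"ALL"` key are hypothetical, and counting only successful runs toward hours saved is an assumption rather than a stated requirement.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunRecord:
    automation: str
    succeeded: bool
    minutes_saved: float  # estimated manual effort avoided by this run


def summarize(records):
    """Return runs, error rate, and hours saved per automation and overall."""
    totals = defaultdict(lambda: {"runs": 0, "failures": 0, "minutes": 0.0})
    for record in records:
        # Each run counts toward its own automation and the "ALL" rollup.
        for key in (record.automation, "ALL"):
            total = totals[key]
            total["runs"] += 1
            if record.succeeded:
                # Assumption: only successful runs realize savings.
                total["minutes"] += record.minutes_saved
            else:
                total["failures"] += 1
    return {
        name: {
            "runs": total["runs"],
            "error_rate": total["failures"] / total["runs"],
            "hours_saved": total["minutes"] / 60,
        }
        for name, total in totals.items()
    }
```

The resulting dictionary can be written to whatever dashboard or report the team agrees on; `"ALL"` is reserved here for the rollup, so no automation should use that name.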
Required Qualifications
- 5 years building automations in enterprise environments (platform, operations, or reliability automation).
- Strong development skills in at least one automation-friendly language (e.g., Python, Java, JavaScript/TypeScript, PowerShell, Bash, .NET, C ), and comfort with APIs.
- Experience building resilient, production-grade workflows (error handling, idempotency, retries, auditing).
- Experience with CI/CD and source control (e.g., Git), including branching and code review practices.
- Hands-on experience with Linux/Unix and scripting; ability to troubleshoot across logs, jobs, and infrastructure.
- Understanding of security fundamentals: least privilege, secrets management, non-person IDs/service accounts, and data handling.
- Ability to work from ambiguous requirements and translate operational pain points into implementable technical solutions.
- Experience documenting and training others on the automations created and how they are invoked, whether automatically when a condition occurs or manually to address an identified condition.
Preferred Qualifications (Nice to Have)
- RPA experience (e.g., UiPath, Automation Anywhere, Power Automate) and/or workflow orchestration (e.g., Airflow, Control-M, Ansible, TWS/IWS).
- ServiceNow development/automation (Flow Designer, IntegrationHub, scripting) and ticket lifecycle automation.
- Cloud-native automation (containers, Kubernetes, serverless functions) and infrastructure-as-code (e.g., Terraform).
- Observability tooling experience (logs/metrics/traces) and building dashboards/alerts.
- Experience in regulated environments (SOX/SOC), including evidence capture and audit-friendly change practices.
- Familiarity with data reconciliation, financial operations, or payment/rebates domains.
- Experience with Jira, Confluence and ServiceNow.
Engagement Model & Working Expectations
- Cadence: Weekly planning/prioritization; bi-weekly demos; weekly written status (progress, blockers, risks, next steps).
- Collaboration: Partner with Production Support Engineering for toil identification and acceptance; partner with SRE for standards (monitoring, reliability patterns).
- Definition of Done: Automation is deployed, monitored, documented, and handed off with ownership and support instructions.
- Change control: All production changes follow standard change processes, approvals, and evidence capture.
Success Metrics (Examples)
- Reduction in manual hours/week for targeted processes (baseline vs. post-automation).
- Decrease in recurring incident volume for automated failure patterns.
- Improvement in MTTR/MTTA for automatable recovery scenarios.
- Automation reliability: success rate, failure rate, and mean time to recovery of the automation itself.
- Adoption: number of teams/processes onboarded to standardized patterns and reusable components.
Assumptions & Dependencies
- Timely provisioning of required access (non-person IDs/service accounts) and test environments.
- Availability of SMEs for process walkthroughs, acceptance testing, and operational handoff.
- Agreement on target platforms/standards (logging, monitoring, deployment approach) to ensure consistency.