Principal DevOps Engineer
Location: Plano, TX (Onsite, 4 days a week in office)
Employment Type: Full-time
Total Compensation Range: $150,000 to $180,000
Applicants must be currently authorized to work in the United States on a full‑time basis and must not require sponsorship now or in the future.
About ServiceLink
ServiceLink is modernizing the mortgage services industry through AI‑accelerated engineering, intelligent automation, and next‑generation software delivery practices. We empower the nation’s top lenders and financial institutions with advanced technology, data‑driven insights, and high‑velocity product development models.
We’re not just evolving legacy workflows - we’re redefining how software is designed, built, tested, and deployed. Generative AI, autonomous systems, and continuous delivery are core to how we operate. Innovation isn’t optional here - it’s the expectation.
If you’re passionate about transforming engineering organizations and operationalizing AI‑driven development models at enterprise scale, you’ll thrive at ServiceLink.
About the Role
We’re hiring a Principal DevOps Engineer to lead our cloud platform, CI/CD, security, and reliability for both traditional services and our agentic AI platform. You’ll own the tooling and automation that help product teams ship faster and more safely by standardizing GitHub/Azure DevOps, Azure pipelines, infrastructure as code, runtime security, LLMOps, and agentic AI deployment/observability.
What You’ll Do
Agentic AI (Core Focus)
- Operationalize agentic AI using industry-leading agentic AI frameworks to deploy and operate single- and multi-agent systems with guardrails, state, memory, tool use (MCP), and workflow orchestration, including long-running agents hosted on Azure (e.g., App Service/Functions/Container Apps/AKS) that follow async patterns for durability and scale.
- Partner with AI engineers to select and integrate the right orchestration SDKs - Agent Framework, Semantic Kernel (agent orchestration patterns), and AutoGen (asynchronous, event‑driven multi-agent), and guide teams on when to use which for production vs. experimentation.
- Integrate regression tests (groundedness, relevance, safety) and wire evaluations into CI (GitHub Actions) so model/prompt changes must pass quality gates before release.
- Observability for agents: implement tracing/logging, metrics, and incident automation for multi-agent workflows.
Engineering Metrics
- Operational Excellence & Reliability - Define and govern KPIs for system reliability, deployment performance, cost efficiency, and platform stability. Drive continuous improvement using DORA‑aligned metrics and AI-specific indicators (e.g., evaluation quality, agent reliability).
- AI & Agentic Workload Quality - Introduce measurable standards for AI readiness and safety, ensuring prompt flows, agents, and retrieval systems meet defined quality gates before release.
- Governance & Security Metrics - Implement executive dashboards for policy compliance, environment governance, and security posture across pipelines, IaC, and AI systems.
- Developer Experience & Platform Efficiency - Track and improve developer onboarding, reuse of platform components, and reduction in operational toil via automation and self‑service.
CI/CD, Artifact Security
- Design, build, and govern reusable CI/CD via GitHub Actions (and/or Azure DevOps as needed): multi-stage builds, environment promotions, approvals, matrix testing, dependency caching, and rollout strategies.
- Artifact security: integrate image/artifact signing and Azure Container Registry.
Security, Compliance
- Embed DevSecOps controls: credential scans, Fortify, dependency scanning and secret scanning, policy checks in PRs, and break-glass governance for production environments.
- Enforce responsible AI safeguards for generative/agentic workloads by configuring Azure OpenAI content filters and safety system messages; version these policies and test them in CI using evaluation datasets.
- Recommend new DevSecOps tools to the platform engineering team.
Required Qualifications
- 7 years of experience in enterprise DevOps and cloud engineering.
- Proven ownership of cloud platform engineering on Azure at scale.
- Deep hands-on experience with CI/CD pipelines.
- Knowledge of IaC with Terraform or Bicep; Azure networking/identity/security (VNETs, Private Endpoints, Entra ID RBAC, Key Vault, Policy).
- DevSecOps: credential scans, Fortify, Mend, container security scans, managed identity, bring your own key (BYOK).
- Agentic AI & LLMOps: practical experience deploying LLM/agent apps using Azure AI Foundry (Prompt flow, evaluations), with working knowledge of agent frameworks (LangChain, CrewAI, MAF, etc.) in production.
- Knowledge of vector databases, embeddings, or retrieval pipelines.
- Expertise in scripting/automation: PowerShell, Python and Bash for pipeline tasks, environment automation, and build tooling.
- Proficiency in using AI code-generation (CodeGen) tools.
Equal Opportunity Employer
ServiceLink, its affiliates, and subsidiaries are Equal Opportunity Employers. All qualified applicants will receive consideration without regard to race, color, religion, sex, age, disability, protected veteran status, national origin, sexual orientation, gender identity or expression, genetic information, or any other protected characteristic.