What are the responsibilities and job description for the Lead Software Engineer - DevOps Engineering/Cloud position at JPMorgan Chase?
We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.
As a Lead Software Engineer at JPMorgan Chase within Consumer and Community Banking, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. As a core technical contributor, you are responsible for conducting critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.
Job responsibilities
- Executes creative software solutions, design, development, and technical troubleshooting, with the ability to think beyond routine or conventional approaches to build solutions or break down technical problems
- Develops secure, high-quality production code, and reviews and debugs code written by others
- Identifies opportunities to eliminate or automate remediation of recurring issues to improve overall operational stability of software applications and systems
- Leads evaluation sessions with external vendors, startups, and internal teams to drive outcomes-oriented probing of architectural designs, technical credentials, and applicability for use within existing systems and information architecture
- Leads communities of practice across Software Engineering to drive awareness and use of new and leading-edge technologies
- Adds to team culture of diversity, opportunity, inclusion, and respect
Required qualifications, capabilities, and skills
- 8 years of hands‑on software/platform engineering experience, including leading cloud‑native delivery for business‑critical systems.
- Strong Kubernetes expertise (workloads, networking, security, autoscaling, upgrades, troubleshooting); experience operating clusters in production.
- Expert Infrastructure as Code with Terraform (modules, state backends, workspaces, CI integration, policy controls).
- Proficiency in Python for platform automation, tooling, and systems scripting; familiarity with Bash/YAML/Helm.
- Deep experience with CI/CD (e.g., Jenkins, Spinnaker/Argo), artifact management, and automated testing strategies.
- Strong AWS/public cloud knowledge (VPC, ALB/NLB, ECR/EKS, IAM, KMS, CloudWatch/CloudTrail) and cloud networking fundamentals.
- Solid understanding of SDLC and agile practices; champions secure coding, resiliency patterns, and release engineering.
- Observability at scale: Prometheus/Grafana, Datadog, log aggregation (e.g., Splunk), actionable SLO/SLA monitoring.
- Demonstrated reduction of operational toil through automation and SRE practices (incident response, blameless postmortems, remediation).
- Practical experience applying agentic AI/LLM capabilities to DevSecOps use cases (e.g., assisted troubleshooting, code/IaC generation with review, runbook automation) with attention to accuracy, guardrails, and auditability.
- Excellent communication and leadership skills; ability to influence architecture and mentor engineers across teams.
Preferred qualifications, capabilities, and skills
- Programming & Scripting: Expert-level Python is mandatory, along with proficiency in Bash, Java, or C for building automation scripts.
- ML Frameworks & Tools: Hands-on experience with TensorFlow, PyTorch, or Scikit-learn, plus MLOps tools like MLflow, Kubeflow, Vertex AI, or DVC.
- Infrastructure & Cloud: Strong knowledge of AWS, Azure, or GCP, including serverless architectures, storage solutions, and network configuration.
- Containerization & DevOps: Expert skills in Kubernetes (K8s), Docker, Helm, GitOps, and CI/CD pipelines (Jenkins, GitLab CI).
- Monitoring & Reliability: Experience setting up monitoring for both infrastructure and models (drift detection, model accuracy) using Prometheus/Grafana.
- Database Systems: Proficiency in managing SQL/NoSQL databases to handle data for training and inference.