Lead Software Engineer (Site Reliability)

hackajob
Malvern, PA Full Time
POSTED ON 4/19/2026
AVAILABLE BEFORE 5/19/2026
hackajob is collaborating with Vanguard to connect them with exceptional professionals for this role.

Join Personal Investor Tech's Site Reliability Engineering team and lead cutting-edge SRE initiatives that impact hundreds of applications and millions of investors. You'll architect and build enterprise-scale resiliency solutions, driving our ambitious 2026 roadmap. This is an opportunity to combine deep technical expertise with strategic influence: designing OpenTelemetry integrations, implementing distributed tracing at scale, automating incident responses, and pioneering AI-enhanced diagnostics and analysis. Work alongside a collaborative, technically focused team where your innovations in resilience engineering will shape Vanguard's next generation of client experiences.

Shape the Future of Observability at Vanguard

At Vanguard, we pride ourselves on delivering an exceptional client experience to all investors; at the core of this experience are systems that reside in a technically complex and constantly evolving resiliency landscape. Passionate, technically skilled engineers are at the center of our resiliency operations, and we are looking to grow our team.

We are seeking an experienced engineer with broad, end-to-end software development experience, including operating applications in a microservices environment in production at scale. This role goes beyond feature implementation: it requires someone who can design, build, and support resilient systems from the ground up.

As a Senior Reliability Engineer at Vanguard, you will play a critical role in solving impactful operational problems. You are curious and take a proactive approach to identifying problems and making improvements. You balance innovative thinking with pragmatism and understand the long-term impacts of technical decisions. You communicate complex ideas clearly and collaborate effectively to deliver scalable solutions.

Core Responsibilities

  • Improve resiliency engineering practices across platforms and applications, including resilient application design patterns, system observability, and deployment strategies.
  • Detect, troubleshoot, and resolve incidents.
  • Develop automation for incident response and infrastructure management.
  • Develop and support OpenTelemetry integrations for multiple application platforms (browser, ECS, Lambda, etc.) and languages (JavaScript, Java).
  • Contribute to architectural decisions and support implementation of solutions.

Skills And Qualifications

  • Deep knowledge of Java or JavaScript. Practical experience developing and operating software in distributed systems environments.
  • Problem-solving and analytical thinking: ability to diagnose complex issues and propose efficient solutions. Strong debugging and optimization skills for performance and scalability.
  • Cloud platforms: hands-on experience with AWS services and cloud infrastructure.
  • System architecture and design: ability to design scalable, secure, and maintainable systems.
  • Working knowledge of Python (or similar scripting language).
  • Strong knowledge of resiliency engineering techniques for both platforms and applications.
  • Experience troubleshooting complex production issues and implementing effective mitigations.
  • Familiarity with OpenTelemetry specification and core APIs.

Special Factors

Sponsorship

Vanguard is not offering visa sponsorship for this position.

About Vanguard

At Vanguard, we don't just have a mission—we're on a mission.

To work for the long-term financial wellbeing of our clients. To lead through products and services that transform our clients' lives. To learn and develop our skills as individuals and as a team. From Malvern to Melbourne, our mission drives us forward and inspires us to be our best.

How We Work

Vanguard has implemented a hybrid working model for the majority of our crew members, designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection. We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.

Salary: $120,000 - $180,000