What are the responsibilities and job description for the Principal Reliability Lead Engineer position at AMD?
THE ROLE
As the Reliability Lead, you will serve as the technical authority for extended reliability modeling and silicon-level risk assessment across AMD’s next-generation x86 SoCs. This role is pivotal in ensuring our products meet rigorous reliability standards for automotive, industrial, networking, and edge markets. You will define strategies that safeguard product longevity and performance under diverse operating conditions, influencing design decisions from concept through production.
THE PERSON
We are seeking a reliability expert who thrives at the intersection of hardware and software. You bring deep physics-of-failure knowledge and system-level power and firmware expertise to the table. You will collaborate with architecture, design, and firmware teams to implement aging-aware performance projections, define DVFS policies, and validate models against silicon data. Your ability to anticipate reliability risks and drive mitigation strategies will ensure AMD delivers robust, long-lived solutions that power mission-critical environments.
KEY RESPONSIBILITIES
- Own extended reliability modeling for CPU/APU SoCs across aging modes (BTI, HCI, EM, TDDB).
- Collaborate with architecture and design teams to build aging-aware performance and power projections.
- Define fuse strategies, frequency guard bands, power limits, and DVFS policy guidelines for reliability compliance.
- Partner with firmware teams to implement power management hooks, throttling responses, and mitigation algorithms.
- Validate reliability models using silicon data; drive calibration and refine lifetime projections.
- Correlate models to silicon monitors (PVT, path monitors) and recommend FW/fuse changes for reliability attainment.
- Create cross-functional documentation, risk assessments, and recommendations for leadership and customers.
PREFERRED EXPERIENCE
- MS/PhD in EE/ECE with experience in SoC reliability, product engineering, or device physics.
- Strong understanding of silicon aging models, derating, and reliability qualification standards (AEC-Q100, JEDEC).
- Expertise in power delivery (SI/PI), IR-drop, EDC/TDC, clock/power gating; experience with reliability telemetry/PVT monitors.
- Proven ability to influence FW architecture (power/thermal/idle governors) and fusing/DFX policies.
- Strong data analysis skills (Python/SQL) and dashboarding for reliability tracking.
- Experience with automotive/industrial reliability and functional-safety standards (AEC-Q100, ISO 26262 ASIL), field-reliability analytics, and silicon lifecycle management is a big plus.
ACADEMIC CREDENTIALS
Bachelor’s or Master’s degree in Electrical Engineering, Computer Engineering, or a related field.
LOCATION:
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.