Supervisor - Server Repair Engineering

SMS INFOCOMM CORPORATION
Grapevine, TX
Full Time
POSTED ON 5/7/2026
AVAILABLE BEFORE 7/7/2026

Position Overview

We are seeking a senior engineering leader to serve as the Supervisor of AI Server Repair Engineering & Process.

This is a foundational role responsible for architecting, defining, and continuously improving the entire technical framework for diagnosing and repairing our complex, high-value AI server infrastructure.

More than a traditional supervisor, you are the lead repair engineer and process owner.

You will leverage your deep hardware expertise to develop systematic, data-driven, and scalable repair processes from the ground up.

You will not only lead a team of technicians and junior engineers but also act as their primary technical mentor and the engineering liaison to our core Product Design and Quality teams.

Your mission is to transform our repair facility into a center of excellence by embedding engineering discipline into every aspect of our service operations.

Key Responsibilities

1. Process Architecture & Definition (Primary Focus):

* Architect and Author: Design, document, and deploy the end-to-end technical workflow for AI server repair. This includes creating detailed Standard Operating Procedures (SOPs), diagnostic flowcharts, decision trees, and work instructions.

* Test Plan Development: Define and validate comprehensive test plans and validation criteria for all repaired components and full systems, ensuring they meet strict performance and reliability standards before being returned to service.

* Tooling & Automation: Identify, develop, and implement diagnostic scripts, software tools, and physical fixtures to improve the accuracy, consistency, and efficiency of the troubleshooting and repair process.

* Process Control: Establish critical control points within the repair process to ensure quality and gather vital failure data.

2. Advanced Engineering Support & Failure Analysis (Primary Focus):

* Technical Authority: Serve as the ultimate escalation point for the most complex hardware failures that elude standard diagnostic procedures.

* Root Cause Analysis (RCA): Lead systematic deep dives into new and recurring failure modes. Perform board-level analysis, interpret schematics, and collaborate with the team to isolate the root cause.

* Engineering Feedback Loop: Act as the primary technical interface between the repair center and core Hardware Engineering/R&D. Consolidate, analyze, and present failure data and RCA findings to influence future product design for improved serviceability and reliability (Design for Serviceability).

3. Operational Leadership & Team Enablement:

* Technical Mentorship: Lead and develop the technical capabilities of the repair team. Provide hands-on training on new products, advanced diagnostic techniques, and established repair processes.

* Enablement, Not Just Delegation: Empower the team by ensuring they have the processes, tools, and knowledge required to succeed. Focus on removing technical roadblocks and fostering an environment of structured problem-solving.

* Performance Management: Set clear technical objectives, manage workflow priorities based on engineering needs, and guide the professional growth of team members.

4. Data-Driven Continuous Improvement:

* Analyze Repair Data: Systematically collect and analyze repair data (failure modes, component usage, test yields) to identify trends and opportunities for process optimization.

* Drive Improvements: Initiate and lead engineering change requests (ECRs) and process improvement projects based on data analysis to enhance repair quality, reduce turn-around time, and lower costs.

Qualifications & Skills

Required Qualifications (Must-Haves):

* Education: Bachelor’s degree in Electrical Engineering, Computer Engineering, Manufacturing Engineering, or a closely related field.

* Experience:

      * 4 years in a technical engineering role such as Test Engineering, Manufacturing Engineering, Hardware Sustaining, or high-level Repair Engineering.

      * Proven track record of developing and documenting technical processes (SOPs, test plans, work instructions) from scratch in a manufacturing or repair environment.

      * 3 years in a technical leadership role, mentoring junior engineers or technicians.

* Technical Expertise:

      * Expert-level ability to read and interpret electronic schematics, board layout files, and product specifications.

      * Strong, hands-on experience with systematic hardware troubleshooting methodologies for complex systems (e.g., servers, networking equipment).

      * Demonstrated proficiency in scripting (Python, Bash, or similar) to automate diagnostic tests and parse data logs.

      * Deep knowledge of server components and architecture, including GPUs, high-speed interconnects (InfiniBand/Ethernet), CPUs, and power systems.

 

Preferred Qualifications (Nice-to-Haves):

* Master’s degree in Electrical or Computer Engineering.

* Experience with Design for Manufacturability (DFM) or Design for Serviceability (DFS) principles.

* Certification and practical application of Lean Manufacturing or Six Sigma methodologies.

* Experience analyzing failure and yield data.

* Hands-on experience with board-level repair techniques (e.g., soldering, BGA rework) is a strong plus.

Salary.com Estimation for Supervisor - Server Repair Engineering in Grapevine, TX
$177,296 to $209,800



