Hardware Engineer, GPU Infrastructure
CoreWeave Roseland, NJ
$120k-142k (estimate)
Full Time 3 Weeks Ago

CoreWeave is Hiring a Remote Hardware Engineer, GPU Infrastructure

CoreWeave is seeking a highly skilled and motivated Infrastructure/Hardware Engineer, focusing on GPU and PCIe troubleshooting, to join our Hardware Engineering team, reporting to the Director of Compute Architecture. In this role, you will play a crucial part in the design, development, troubleshooting, and optimization of our server hardware infrastructure. You will collaborate closely with cross-functional teams, external vendors, and stakeholders to ensure the successful delivery of highly performant and reliable hardware solutions.

Responsibilities:

  • Troubleshoot complex GPU- and PCIe-related failures
  • Partner with external vendors on failure analysis
  • Track component RMAs
  • Develop and maintain hardware/firmware management services.
  • Automate all aspects of the server hardware lifecycle.
  • Serve as the senior point of contact for hardware escalation and troubleshooting.
  • Collaborate with cross-functional teams to define hardware requirements, specifications, and system architecture.
  • Create and maintain accurate documentation of hardware designs, specifications, test procedures, and results.
  • Analyze and optimize the performance of hardware systems, identify bottlenecks, and propose improvements for enhanced efficiency.
  • Establish processes for internal hardware testing, deployment, and performance optimization.

The ideal candidate will have at least 2 years of professional experience with the following:

  • Prior experience supporting and troubleshooting data-center-class GPUs (preferably A100 or newer)
  • Proficiency in Ansible/Python and experience programmatically interacting with server BMCs using IPMI or Redfish (preferably Redfish).
  • Experience using, integrating, and automating data-center-class GPU diagnostics and troubleshooting tools
  • In-depth knowledge of server hardware, components, and management technologies, particularly GPUs and PCIe devices.
  • Proven ability to stay updated with the latest industry technologies and trends.
  • Previous experience collaborating with hardware vendors.
  • Strong passion for automation, with a commitment to automating processes comprehensively.
  • Excellent documentation skills and attention to detail.
  • Strong analytical and problem-solving abilities.

Hybrid Workplace

Successful candidates will be expected to attend onboarding training at our NJ Headquarters within their first several weeks of employment, and to travel quarterly thereafter for one week at a time.

If you reside within a 30-mile radius of our New Jersey, New York, or Philadelphia offices, we're excited for you to join us at the office at least three times a week, recognizing the significance we place on fostering connections, collaboration, and creativity within our office culture. Our commitment to operating as a hybrid workplace underscores our dedication to enabling our employees to tailor their work-life balance to their individual preferences.

Job Summary

JOB TYPE: Full Time
SALARY: $120k-142k (estimate)
POST DATE: 05/18/2024
EXPIRATION DATE: 07/17/2024
WEBSITE: coreweave.com
HEADQUARTERS: New York, NY
