Senior Hardware Systems Engineer - LPU Platform Pathfinding

NVIDIA · Semiconductors · Santa Clara, CA

This role focuses on hardware systems engineering for NVIDIA's Language Processing Unit (LPU) platforms, which are designed to support demanding AI workloads. The engineer will drive hardware pathfinding, guide system-level technical decisions, and work with silicon, data center, cloud, and manufacturing teams to deliver production-ready systems. Responsibilities include leading full-stack system debug, owning interconnect and cooling architecture, and supporting program execution. The role requires broad system knowledge across the electrical, mechanical, thermal, and firmware domains, along with exposure to large-scale AI platforms.

What you'd actually do

  1. Drive hardware pathfinding for LPU platforms by leading early investigations into new architectures for power delivery, cooling, mechanical design, and high‑bandwidth interconnects.
  2. Guide system‑level technical decisions that influence NVIDIA’s LPU roadmap.
  3. Work with silicon, architecture, data center, cloud, product, and manufacturing teams to deliver complete, production‑ready systems.
  4. Lead full‑stack system debug, addressing electrical, mechanical, thermal, firmware, OS, and application behaviors.
  5. Support program execution by tracking issues, driving root‑cause analysis, and ensuring timely resolution.

Skills

Required

  • Bachelor's or Master's degree in Electrical Engineering, Mechanical Engineering, or related field (or equivalent experience)
  • 8+ years of proven experience leading platform, system, or hardware engineering efforts from concept through delivery
  • Broad system knowledge in electrical/power design, mechanical/thermal engineering, firmware integration, and performance tuning
  • Experience in pathfinding, evaluating early designs and maturing them into real solutions
  • Strong debugging and problem-solving skills across hardware and software boundaries
  • Clear communication and the ability to align diverse engineering groups

Nice to have

  • Direct experience designing or validating LPU, GPU, or other accelerator systems, especially at rack or cluster scale
  • Ownership of first‑of‑kind system designs, such as new power networks, liquid cooling, or optical interconnects
  • Leadership in pathfinding or advanced development programs that shaped product direction
  • Familiarity with hyperscale data center infrastructure, including cooling methods, facility power, and interconnect fabrics
  • Strength in early bring‑up, diagnostics, signal integrity, and rapid prototyping, as well as participation in industry standards efforts or cross‑company technical initiatives

What the JD emphasized

  • leading early investigations
  • shape system architecture
  • solve complex challenges
  • strengthen the hardware foundation
  • leading platform, system, or hardware engineering efforts from concept through delivery
  • pathfinding, evaluating early designs and maturing them into real solutions
  • Exposure to large‑scale AI platforms
  • Direct experience designing or validating LPU, GPU, or other accelerator systems
  • Ownership of first‑of‑kind system designs
  • Leadership in pathfinding or advanced development programs