Principal Software Engineer - Manufacturing & Factory

NVIDIA · Semiconductors · Santa Clara, CA +4 · Remote

NVIDIA is seeking a Principal Software Engineer to design, drive, and operationalize rack-scale factory and deployment flows for next-generation data center products. This role involves leading end-to-end factory workflows, collaborating with partners, and ensuring reliability and debuggability in tool design. The ideal candidate will have deep systems expertise and a passion for building scalable manufacturing solutions.

What you'd actually do

  1. Lead and drive rack-scale/L11 flows for factory and initial data center deployment.
  2. Design and implement end-to-end factory workflows, including firmware flashing sequences, security provisioning, and deployment of software mitigations.
  3. Collaborate with data center architects, ODMs, and OEMs to define factory and data center requirements that ensure efficient and reliable production ramp.
  4. Champion reliability, debuggability, and optimization in firmware, diagnostic, and deployment tool design.
  5. Drive pre-silicon readiness for factory and manufacturing workflows for rack-scale products, using NVIDIA's industry-leading simulation and emulation technology.

Skills

Required

  • System architecture and design
  • Server systems design
  • SW/HW interface
  • Networking technology & protocols
  • System software for accelerators (GPUs, DPUs, FPGAs)
  • Out-of-band and in-band management architectures
  • System management protocols (Redfish, IPMI)
  • Left shift strategy implementation

Nice to have

  • Large-scale cloud and cluster level deployment and management systems
  • Data center product lifecycle management (inception, pre-silicon, post-silicon, manufacturing, deployment)

What the JD emphasized

  • deep systems expertise
  • decisive technical leadership
  • passion for building reliable, debuggable, and scalable manufacturing and deployment solutions
  • BS or MS degree in Computer Engineering, Computer Science, or related degree or equivalent experience
  • 15+ years in the area of system architecture and design
  • Deep experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.
  • Strong understanding of networking technology & protocols (e.g. Ethernet, InfiniBand)
  • Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs
  • Expertise in out-of-band and in-band management architectures.
  • Knowledge of system management protocols such as Redfish and IPMI.
  • Demonstrable experience in implementing left shift strategy to de-risk program execution.