Infrastructure Hardware Technical Program Manager (Server and Network Systems)

Cerebras · Semiconductors · US and Canada Offices · Software

This role is for an Infrastructure Hardware Technical Program Manager responsible for the end-to-end delivery of server and network platform programs for Cerebras CS-3-based AI clusters. It requires a technical understanding of server and network systems, program management skills, and experience coordinating with vendors and internal teams. Although the company builds AI hardware and infrastructure, the role focuses on hardware program management rather than direct AI/ML model development or research.

What you'd actually do

  1. Own end-to-end program execution for server systems and network equipment in Cerebras clusters, including new platforms, refreshes, and major component/config changes.
  2. Drive requirements gathering and convert inputs into executable plans with clear milestones, readiness gates, and cross-functional deliverables.
  3. Represent Cluster Architecture in executive reviews, OKR cycles, and leadership/customer forums as needed.
  4. Build and manage integrated schedules across vendors and internal teams, and track dependencies, the critical path, and risks.
  5. Manage OEM/ODM and switch vendor engagements (RFI/RFP, samples, escalations, roadmap alignment).

Skills

Required

  • B.S. or M.S. in Computer Science, Electrical/Computer Engineering, or equivalent experience.
  • 8+ years in Technical Program Management (or similar delivery leadership) for server, network, or infrastructure platforms from concept through production.
  • Experience coordinating complex server and/or datacenter network programs across OEM/ODMs, switch vendors, and internal engineering teams.
  • Working knowledge of server architecture (CPU/NUMA, memory bandwidth, PCIe, NIC and storage IO) and enough networking fundamentals (leaf-spine fabrics, switch platforms, high-performance interconnects) to run effective technical reviews.
  • Familiarity with Linux server fleet management (provisioning, firmware/BIOS, drivers, field triage).
  • Strong multi-team program execution skills: integrated plans, risk management, dependency tracking, and executive-level communication.
  • Ability to operate in ambiguity and keep parallel server and network workstreams aligned.

Nice to have

  • Experience with AI/ML, HPC, or performance-sensitive distributed infrastructure.

What the JD emphasized

  • server and network platform programs
  • server and network systems
  • server architecture
  • networking fundamentals
  • server and network workstreams