Networking Operating System Firmware Engineer

OpenAI · AI Frontier · San Francisco, CA · Scaling

Seeking a Networking Operating System Firmware Engineer to build and maintain custom NOS images from scratch for OpenAI's AI supercomputers. This role involves working with open-source networking components, the Linux kernel, switch ASICs, and platform drivers to ensure the reliability and performance of the switching layer.

What you'd actually do

  1. Design, develop, and maintain custom NOS images for large-scale AI fabrics, using open source components from SONiC, FRR, and related networking stacks.
  2. Integrate, build, and configure Linux kernel components, device drivers, switch ASIC SDKs, and SAI layers.
  3. Bring up new switch platforms, including thermal and fan control, power monitoring, transceiver management, watchdogs, OSFP CMIS, LEDs, CPLDs, and board-specific platform logic.
  4. Extend and customize NOS services for routing, telemetry, control-plane state, and distributed automation.
  5. Implement and debug route, neighbor, next-hop, and ECMP programming flows from control-plane intent through ASIC hardware state.

Skills

Required

  • Networking
  • NOS internals
  • Switch hardware
  • Production systems
  • Linux kernel
  • Device drivers
  • Switch ASIC SDKs
  • SAI implementations
  • C/C++
  • Python
  • Go
  • Rust
  • L2/L3 forwarding
  • ECMP
  • RoCE
  • BGP
  • QoS
  • PFC
  • Buffer tuning
  • Telemetry
  • Platform bring-up
  • Board-level debugging
  • OpenConfig gNMI
  • YANG data models
  • CI/CD pipelines
  • Distributed config and state management
  • Reproducible builds
  • Large-scale automation

Nice to have

  • SONiC
  • FBOSS
  • Cumulus Linux
  • Arista EOS
  • Junos PFE-level integration
  • hwmon
  • I2C/SMBus
  • CPLDs
  • OSFP CMIS
  • OpenConfig gNMI
  • YANG data models
  • Rust
  • Go

What the JD emphasized

  • Custom NOS images from scratch
  • Work across the Linux kernel, switch ASIC SAI/SDKs, platform drivers, control-plane services, and orchestration layers
  • Deep understanding of networking, NOS internals, switch hardware, and production systems
  • Design, implement, test, and debug production NOS software across platform drivers, routing and control-plane state, ASIC programming, observability, and fleet integration
  • Work through ambiguous, open-ended technical problems and drive feature development across software, hardware, and vendor boundaries
  • Proven experience working with SONiC or comparable NOS stacks
  • Experience with Linux kernel internals, network device drivers, platform drivers, hwmon, I2C/SMBus, CPLDs, or board-level platform software
  • Experience integrating or debugging Broadcom, Marvell, NVIDIA, Intel, or comparable switch ASIC SDKs and SAI implementations
  • Ability to independently drive ambiguous NOS or platform feature development from problem definition through implementation, validation, rollout, and debugging across software, hardware, and vendor boundaries