Senior Hardware Engineer - GPU & AI Infrastructure

Roblox · Consumer · San Mateo, CA · Engineering Operations

Senior Hardware Engineer focused on GPU and AI infrastructure, responsible for the full lifecycle of GPU hardware, from architectural evaluation to fleet integration and performance tuning, ensuring optimal performance for rendering and ML workloads.

What you'd actually do

  1. Architect & Prototype: Architect and prototype next-generation GPU-accelerated hardware platforms, ensuring seamless integration between high-density compute nodes, high-speed interconnects (NVLink/PCIe Gen5/6), and system firmware.
  2. GPU Optimization: Drive the integration, performance testing, and debugging of GPUs in our fleet, focusing specifically on hardware-level optimizations, driver tuning, and thermal/power management.
  3. Validation & Certification: Develop and execute rigorous evaluation and stress-testing strategies for GPU-heavy server platforms to ensure they meet Roblox’s unique demands for real-time rendering and low-latency AI inference.
  4. Firmware & Systems: Lead firmware qualification (BIOS/BMC) and troubleshooting, implementing automation systems to manage GPU health and firmware updates.
  5. Vendor Collaboration: Provide technical guidance and deep-dive feedback to hardware vendors. Lead critical investigations into component-level failures, triaging issues across the hardware, driver, and kernel layers.

Skills

Required

  • GPU architecture
  • AI accelerators
  • high-performance compute (HPC) systems
  • PCIe fabric
  • NVLink
  • InfiniBand
  • liquid cooling systems
  • testing and validating CPU, memory (HBM/DDR5), storage (NVMe), and high-speed networking subsystems
  • Linux environment
  • Python
  • Go
  • C++
  • hardware validation tools
  • automation scripts
  • debugging complex server issues remotely
  • kernel logs
  • hardware registers
  • bus-level captures

Nice to have

  • NVIDIA HGX/MGX platforms

What the JD emphasized

  • GPU architecture
  • AI accelerators
  • low-latency AI inference