Data Center Systems Engineer

AMD · Semiconductors · Austin, TX · General Management/Administration/Support

This role focuses on deploying, validating, and sustaining AMD's CPU and GPU platforms in large-scale data center environments. The engineer will be responsible for system-level debugging, platform reliability, and continuous improvement of tools and processes. The role involves hands-on work with hardware, firmware, and systems in a live data center setting.

What you'd actually do

  1. Manage platform deployment, availability, and stability for data center CPU/GPU systems
  2. Lead system-level debugging efforts involving hardware, firmware, Linux, networking, power, and thermal behavior
  3. Develop structured debug strategies, validation flows, and failure-analysis methodologies
  4. Track daily activities, prioritize issues, assign work, and monitor progress to resolution
  5. Collaborate with silicon, firmware, validation, networking, and operations teams to assess risks and requirements

Skills

Required

  • Systems engineering experience supporting complex CPU, GPU, or SoC-based platforms
  • Platform or system-level validation, bring-up, or design reliability experience
  • Debugging expertise across BIOS, BMC, Linux, and hardware interfaces
  • Familiarity with test automation and failure-analysis methodologies
  • Experience working with OEM, ODM, or hardware vendors
  • Knowledge of high-speed interconnects such as PCIe Gen5
  • Hands-on experience using hardware lab equipment (e.g., scopes, programmers, system bring-up tools)
  • Exposure to FPGA, CPLD, or firmware development environments

Nice to have

  • Advanced degree desired but not required