Data Center Engineer

AMD · Semiconductors · Austin, TX · Engineering

This role focuses on the installation, validation, and technical support of AMD's datacenter GPU hardware and software for AI/ML projects, specifically for OEM partners and enterprise customers. The engineer will build datacenter GPU containers, qualify software functionality, and help resolve hardware and software issues throughout the product lifecycle, working with customers and program managers.

What you'd actually do

  1. Perform node and cluster-level software installation and validation for GPU/compute AI and Machine Learning projects.
  2. Resolve technical issues for customers utilizing AMD Instinct™ products.
  3. Provide technical guidance and support to customers for server graphics and compute projects related to AI and Machine Learning workloads.
  4. Build datacenter GPU Docker images and containers for customer testing and deployment.
  5. Qualify and assess new software functionality to ensure compatibility with customer requirements.

Skills

Required

  • Experience in datacenter customer support roles
  • Large-scale cluster deployment within hyperscale datacenters
  • Server architecture and functionality
  • Linux installation, setup, usage, and debugging
  • Virtual environments (e.g., VMware, Citrix, KVM, Microsoft)
  • Datacenter GPU software stacks such as AMD ROCm™ or NVIDIA CUDA
  • Validating multinode AI clusters using AMD tools (e.g., AGFHC, RCCL RDMA) or equivalents
  • AI/Machine Learning workloads, frameworks, and models
  • Strong debugging, problem-solving, and analytical skills
  • Excellent verbal and written communication skills

Nice to have

  • Technical certifications in relevant software systems

What the JD emphasized

  • AI GPU hardware, software, and networking

Other signals

  • customer support
  • datacenter deployment
  • AI GPU hardware/software