Data Center Engineer

AMD · Semiconductors · San Jose, CA · Engineering

This role focuses on the installation, validation, and technical support of AMD's datacenter graphics hardware/software subsystem projects for OEM partners and enterprise customers, specifically for AI and Machine Learning workloads. The engineer will build datacenter GPU Docker images and containers, qualify new software functionality, and help resolve hardware/software technical issues throughout the product lifecycle.

What you'd actually do

  1. Perform node and cluster-level software installation and validation for GPU/compute AI and Machine Learning projects.
  2. Resolve technical issues for customers using AMD Instinct™ products.
  3. Provide technical guidance and support to customers for server graphics and compute projects related to AI and Machine Learning workloads.
  4. Build datacenter GPU Docker images and containers for customer testing and deployment.
  5. Qualify and assess new software functionality to ensure compatibility with customer requirements.

Skills

Required

  • Datacenter customer support roles
  • Large-scale cluster deployment within hyperscale datacenters
  • Server architecture and functionality
  • Linux installation, setup, usage, and debugging
  • Virtual environments (e.g., VMware, Citrix, KVM, Microsoft)
  • Datacenter GPU software stacks such as AMD ROCm™ or NVIDIA CUDA
  • Validating multinode AI clusters using AMD tools (e.g., AGFHC, RCCL RDMA) or equivalents
  • AI/Machine Learning workloads, frameworks, and models
  • Strong debugging, problem-solving, and analytical skills
  • Excellent verbal and written communication skills

Nice to have

  • Technical certifications in relevant software systems

What the JD emphasized

  • AI GPU hardware
  • AI/Machine Learning
  • AI/Machine Learning workloads
  • AI and Machine Learning projects

Other signals

  • customer support
  • datacenter deployment
  • AI GPU hardware/software