Cloud Hardware Dev Engineer (AWS Generative AI & ML Servers), AWS Generative AI & ML Servers

Amazon · Big Tech · Seattle, WA · Hardware Development

This role focuses on designing, developing, and operating cloud hardware for AWS Generative AI and ML servers. The engineer will define server architectures, create detailed designs and specifications for high-performance AI training and inference, and manage validation strategies from component bring-up to rack integration. Responsibilities include triaging hardware issues, conducting root cause analysis, overseeing server fleets, and driving continuous improvement in collaboration with internal teams and external partners.

What you'd actually do

  1. Partner with EC2 to define server architectures based on workload demand and customer requirements
  2. Translate these architectures into detailed supporting designs and component specifications that enable high-performance AI training and inference at scale
  3. Define and drive validation strategies from PCBA bring-up through server and rack integration, ensuring designs meet performance, thermal, mechanical, power, and signal integrity requirements
  4. Lead design and manufacturing partners through development and production, working with interdisciplinary teams of component, firmware, test, qualification, and integration engineers
  5. Triage hardware issues at both ODM facilities and datacenters, conduct root cause analysis, and implement corrective actions

Skills

Required

  • Experience in developing functional specifications, design verification plans and functional test procedures
  • Experience in server technologies such as thermal, mechanical, power, and signal integrity
  • Bachelor's degree or above in electrical engineering, computer engineering, or equivalent
  • 5+ years of design/innovation, research and development, manufacturing, process, industrial engineering, or related experience
  • Experience leading process improvement, systems development, and project management
  • Strong English-language communication skills, both written and verbal
  • In-depth expertise in one or more server technologies, such as thermal or mechanical design

Nice to have

  • 7+ years of equivalent experience

What the JD emphasized

  • high-performance AI training and inference at scale