Cloud Hardware Dev Engineer (AWS Generative AI & ML Servers), AWS Generative AI & ML Servers

Amazon · Big Tech · Seattle, WA · Hardware Development

This role focuses on designing, developing, and operating AWS cloud offerings for AI/ML and HPC workloads, specifically involving accelerated servers for AI training and inference. The engineer will work with interdisciplinary teams and partners to bring these servers to data centers and oversee their performance and quality post-launch.

What you'd actually do

  1. Own and lead the design, development, and root-cause analysis for a new segment of accelerated servers.
  2. Work closely with customers to understand their technical needs and business goals, drawing on your server design experience and the knowledge of various teams to architect solutions that will be deployed at scale.
  3. Work with an interdisciplinary team of component, firmware, test, qualification, and integration engineers, and lead design and manufacturing partners to bring these servers to the data center.
  4. Oversee the fleet of servers you develop, monitoring their quality and how well they meet customer requirements.
  5. Interface with internal and external customers to understand project requirements and facilitate system development on top of your server design.

Skills

Required

  • Experience working with interdisciplinary teams to execute product design from concept to production
  • Experience in developing functional specifications, design verification plans and functional test procedures
  • Experience in server technologies such as thermal, mechanical, power, and signal integrity
  • Bachelor's Degree in EE or equivalent

Nice to have

  • Master's degree or above in electrical engineering, computer engineering, or equivalent
  • Experience with the project management of technical projects
  • Experience in compute and storage server architecture and design for large-scale applications
  • 5+ years of hardware development with a focus on system/server development in compute and/or storage server architecture and design for large-scale applications

What the JD emphasized

  • accelerated servers
  • AI training and inference
  • Generative AI cloud
  • AI/ML and HPC workloads