Systems Development Engineer (AWS Generative AI & ML Servers), AWS HW Engineering

Amazon · Big Tech · Austin, TX · Systems, Quality, & Security Engineering

This role focuses on building and operating AWS cloud infrastructure specifically for AI training and inference workloads. The Systems Development Engineer will design, deliver, and operate server solutions that enable high performance and scalability for AI/ML and HPC. The role involves creating automation through agentic workflows and implementing AI-driven tools, impacting both AI implementation and core architecture within the AWS Hardware Engineering team.

What you'd actually do

  1. You will be a technical leader solving complex architectural problems that may not be defined beforehand.
  2. You will own the team's systems and work proactively to identify deficiencies, write tactical code to solve issues before they impact customers, and work with your team to scale the solution.
  3. You will decompose large, difficult server-system testability, reliability, and diagnosis problems into straightforward tasks, components, or features, which you will deliver yourself and lead others to deliver in parallel.
  4. You will use a combination of hardware, software, system design, x86 architecture, process, diagnosis, and operations knowledge.
  5. In this role you will create automation through agentic workflows.
  6. You’ll develop smart automation solutions, implement AI-driven tools and workflows, and be part of the AI transformation.

Skills

Required

  • Systems development
  • Hardware engineering
  • Software engineering
  • Cloud infrastructure
  • AI/ML workloads
  • HPC workloads
  • Server design
  • Systems debugging
  • Automation
  • Agentic workflows
  • x86 architecture
  • Technical leadership
  • Problem-solving

Nice to have

  • Generative AI
  • LLMs
  • AWS services
  • Instance types

What the JD emphasized

  • builders
  • builder

Other signals

  • building the backbone of Generative AI cloud
  • designing, delivering and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads
  • direct impact on AI-powered innovation
  • implementing automation solutions that directly enhance the productivity of our engineers
  • influence both AI implementation and core architecture
  • creating automation through agentic workflows
  • implement AI-driven tools and workflows
  • part of AI transformation
  • AWS Accelerated server solutions for AWS Cloud