Product Test Engineer - Machine Learning Hardware, Rrl Technical Engineering

Amazon · Big Tech · Florence, KY · Project/Program/Product Management--Technical

This role focuses on developing system-level test solutions for ML acceleration hardware used in Amazon's global server fleet. The engineer will design and implement test strategies, create scalable test infrastructure, and collaborate with hardware and software teams to ensure product reliability and efficiency in data center environments. The role involves debugging complex hardware/software interactions and optimizing test workflows.

What you'd actually do

  1. Design and implement system-level test strategies for ML acceleration products
  2. Develop comprehensive functional and performance tests for complete ML systems
  3. Create and maintain scalable test infrastructure for high-volume product validation
  4. Implement product bring-up and first-boot test procedures
  5. Drive improvements in test coverage, product quality, and manufacturing efficiency

Skills

Required

  • site reliability engineering (SRE)
  • systems engineering
  • systems administration
  • DevOps
  • security administration
  • network administration
  • Linux
  • Python
  • Java
  • Perl
  • PHP
  • Ruby
  • Bash
  • Shell

Nice to have

  • TCP/IP
  • networking protocols
  • HTTP
  • DNS
  • scripting automation
  • 24/7 production environment
  • service-oriented architecture
  • web services

What the JD emphasized

  • ML acceleration hardware
  • system-level test strategies
  • ML workloads
  • complex hardware/software interactions