Software Development Engineer - Silicon Development Infrastructure

Amazon · Big Tech · Austin, TX · Software Development

This role focuses on building and operating infrastructure for silicon development, including cloud infrastructure, HPC clusters, and automation tooling. While it mentions "Machine Learning Accelerators" and working with teams developing custom silicon for AWS, the core responsibility is infrastructure engineering, not direct AI/ML model development or research. The role supports AI hardware development but is not an AI/ML role itself.

What you'd actually do

  1. Partner with silicon design, verification, emulation, and software teams to understand their development workflows, pain points, and iteration cycles.
  2. Build tooling and automation that eliminates manual toil and reduces time-to-results.
  3. Design, implement, and operate cloud infrastructure and high-performance computing clusters using schedulers like Slurm.
  4. Develop monitoring, diagnostics, and alerting systems that surface actionable insights on efficiency, utilization, reliability, and cost trends.
  5. Take ownership of platform reliability, performance, and cost efficiency from initial design through production operation.

Skills

Required

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience in design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience programming with at least one software programming language
  • 3+ years of administrative experience in networking, storage systems, and operating systems, plus hands-on systems engineering experience
  • Knowledge of systems engineering fundamentals (networking, storage, operating systems)
  • Experience programming with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, Ruby
  • Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent

Nice to have

  • 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust with demonstrated ability to write production-quality, maintainable code
  • Experience utilizing AWS cloud solutions in a DevOps environment with infrastructure as code (CloudFormation, Terraform, CDK)
  • Experience with Linux/Unix
  • Experience in automating, deploying, and supporting large-scale infrastructure
  • Experience with high-performance computing (HPC) clusters using workload schedulers like Slurm
  • Familiarity with semiconductor developm