Software Development Engineer I, ML Infra Services, Annapurna Labs

Amazon · Big Tech · Seattle, WA · Software Development

Software Development Engineer role building and evolving machine learning infrastructure services for custom AI accelerators (AWS Neuron). The role focuses on tooling for profiling, optimization, and resource management of large-scale ML workloads, at the intersection of Kubernetes and custom silicon.

What you'd actually do

  1. Design and implement tooling for profiling, optimization, and resource management of ML workloads on custom accelerators.
  2. Build high-impact solutions that ship to a large and growing customer base.
  3. Participate in design discussions, code reviews, and cross-functional collaboration with hardware, software, and customer-facing teams.
  4. Create metrics, implement automation, and resolve root causes of software defects.
  5. Work in a startup-like environment where you're always focused on the most important problems.

Skills

Required

  • Experience with at least one modern language such as Java, Python, C++, or C# including object-oriented design
  • Experience with at least one general-purpose programming language such as Java, Python, C++, C#, Go, Rust, or TypeScript
  • Experience with data structure implementation, basic algorithm development, and/or object-oriented design principles
  • Proficiency in Java and at least one of Go, Python, or TypeScript
  • Familiarity with Git and CI/CD pipelines

Nice to have

  • Experience from a technical internship
  • Experience in optimization mathematics such as linear programming and nonlinear optimization
  • Experience with distributed, multi-tiered systems, algorithms, and relational databases
  • Experience with cloud platforms (preferably AWS), database systems (SQL and NoSQL), AI tools for development productivity, contributing to open-source projects, and/or version control systems
  • Internship or project experience with AWS services (EKS, EC2, Lambda, S3, DynamoDB, or SQS)
  • Familiarity with distributed systems or big data architectures
  • Experience with Linux systems and performance profiling
  • Exposure to compiler toolchains, code generation, or instruction set architectures (CPU, NPU, GPU)

What the JD emphasized

  • custom AI accelerators
  • ML workloads
  • custom silicon

Other signals

  • inference
  • optimization
  • infrastructure orchestration