Software Development Engineer II, Items and Relationships Platform

Amazon · Big Tech · Seattle, WA · Software Development

Software Development Engineer II role focused on building and optimizing GenAI serving systems and ML platforms at massive scale. The role involves working with LLMs, VLMs, and multimodal foundation models, including optimized model serving, distillation, quantization, distributed inference, vector indices, and agentic systems. The primary focus is on the engineering and infrastructure aspects of bringing AI models to production, with a secondary involvement in agentic systems.

What you'd actually do

  1. Build and optimize GenAI serving systems at massive scale—cascaded inference with intelligent model routing, optimized LLM/VLM serving pipelines, and inference optimization techniques that achieve order-of-magnitude cost reductions while processing millions of daily submissions across billions of products
  2. Build ML platforms and agentic systems that power the full experiment-to-production lifecycle—automated training pipelines, intelligent data curation, continuous model improvement, evaluation frameworks, and CI/CD for all model workflows—dramatically accelerating how fast research ideas become production systems
  3. Architect reliable distributed systems from scratch within Amazon's ecosystem, delivering high availability, low latency, and operational excellence across hundreds of millions of daily transactions
  4. Partner with applied scientists to productionize research—bridging the gap between experimental models and robust, maintainable production infrastructure
  5. Generate intellectual property through patents and publications—contributing novel systems designs, serving optimization techniques, and agentic architectures to the broader ML engineering community

Skills

Required

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience designing or architecting new and existing systems (design patterns, reliability, and scaling)
  • Experience programming in at least one programming language

Nice to have

  • 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent
  • Experience building complex software systems that have been successfully delivered to customers, or experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution
  • Experience with vLLM, SGLang, TensorRT or similar platforms in production environments, or experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution
  • Experience with large-scale data systems, vector databases, and approximate nearest neighbor search
  • Experience building CI/CD pipelines, workflow orchestration, and automation frameworks for ML workflows

What the JD emphasized

  • optimizing LLM/VLM serving for latency and cost at massive scale
  • designing agentic systems that autonomously reason over complex product data
  • building the automated pipelines that continuously integrate, test, and deploy models into production
  • productionize research
  • serving optimization techniques
  • agentic architectures

Other signals

  • ML engineering
  • GenAI
  • LLMs
  • VLMs
  • multimodal foundation models
  • serving infrastructure
  • ML platforms
  • optimized model serving
  • distillation
  • quantization
  • distributed inference
  • vector indices
  • agentic systems
  • data curation
  • training
  • evaluation
  • latency
  • cost
  • distributed systems
  • productionize research
  • CI/CD for model workflows