Software Development Engineer II, Intelligent Cloud Hosting (ICON)

Amazon · Big Tech · Seattle, WA · Software Development

Software Development Engineer II on the Intelligent Cloud Hosting (ICON) team at Amazon, responsible for building AI-powered incident response systems that automate incident investigation and mitigation recommendations for Amazon's cloud infrastructure. The role involves designing and building generative AI workflows, working with AWS services, and developing operational tooling at massive scale.

What you'd actually do

  1. Design and build production generative AI workflows that automate incident investigation, from alert ingestion through root-cause analysis to mitigation recommendations
  2. Work on tier-1, multi-tenant, high-performance systems built on AWS services (Step Functions, Bedrock, DynamoDB, Athena) with technical challenges unique to this kind of scale and throughput
  3. Build developer productivity and operational tooling including orchestration, predictive analytics, automated diagnosis, and self-healing systems
  4. Design and build distributed systems and automation in a large-scale cloud environment that supports millions of customers globally
  5. Develop scalable services and tools on AWS that process high volumes of operational data to drive better decision-making

Skills

Required

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience programming with at least one software programming language

Nice to have

  • 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent

What the JD emphasized

  • generative AI workflow
  • automate incident investigation workflows
  • AI-powered incident response systems
  • generative AI and machine learning

Other signals

  • AI-powered incident response systems
  • automate incident investigation workflows
  • generative AI workflow