Software Development Engineer in Test, Alexa Global Quality - India

Amazon · Big Tech · IN, KA, Bengaluru · Software Development

This role focuses on building and evolving agentic automation tooling for end-to-end quality evaluation of Alexa's multi-locale experience. It involves creating scalable, AI-powered solutions for validation across speech, visual, conversational quality, and cultural/linguistic dimensions, including synthetic test generation and LLM-as-a-Judge evaluations.

What you'd actually do

  1. Building and evolving agentic tooling for end-to-end quality evaluation, from synthetic test case generation, to multimodal response validation, to automated cultural and linguistic correctness assessment
  2. Establishing a consistent, scalable testing framework that supports multi-locale validation and complements the development processes across all supported locales
  3. Creating metrics and reports on quality status, technical operations, and system performance across all locales, driving towards agreed-upon quality bars that are consistent yet culturally appropriate per market
  4. Working closely with engineers to architect and develop the best technical design and testing approach for internationalization (i18n)
  5. Working effectively with product managers, Country PMs, designers, engineering, locale experts, and business teams to deliver the best overall experience factoring in financial goals, usability goals, and locale-specific customer expectations

Skills

Required

  • 4+ years of non-internship professional software development testing experience
  • 2+ years of experience building test automation frameworks and tools
  • Experience programming with at least one modern language such as Java, C++, or C# including object-oriented design
  • Experience in penetration testing and exploitability-focused vulnerability assessment
  • Experience in platform-level security mitigations and hardening for Linux and Windows

Nice to have

  • Knowledge of overall system architecture, scalability, reliability, and performance in a database environment
  • Experience building test automation frameworks and tools

What the JD emphasized

  • multilingual, conversational CX evaluation across multimodal experiences spanning text, voice, and visual
  • agentic automation tooling
  • AI-powered automation solution for multi-locale and linguistic product validation
  • scalable testing framework that supports multi-locale validation
  • agentic tooling for end-to-end quality evaluation
  • multimodal response validation
  • automated cultural and linguistic correctness assessment
  • metrics and reports on quality status
  • quality bars that are consistent yet culturally appropriate per market

Other signals

  • AI-powered automation solution
  • agentic automation tooling
  • LLM-as-a-Judge evaluation
  • cultural relevancy classification powered by machine learning