Manager, Test and Tools Development Engineering

NVIDIA · Semiconductors · Pune, India

Manager for a test and tools development engineering team focused on building autonomous systems and AI-powered quality infrastructure for Omniverse. The role involves leading a team to design agentic test pipelines, build multi-agent orchestration for test generation and failure triage, and establish evaluation frameworks for AI-generated outputs.

What you'd actually do

  1. Building and leading a team of test and tools development engineers who design agentic test pipelines, quality infrastructure, and autonomous workflows for Omniverse
  2. Owning the technical vision for AI-powered quality infrastructure — multi-agent orchestration for test generation, failure triage, root cause analysis, and automated bug filing
  3. Setting speed-of-light goals that compel the team to rethink assumptions, not just work harder — and making architecture decisions that balance ambition with production reliability
  4. Establishing evaluation frameworks for AI-generated outputs, ensuring autonomous pipelines are reliable and observable — not just fast
  5. Partnering with QA leadership, platform teams, and product engineering to identify high-leverage automation opportunities and ensure the tools this team builds are trusted and adopted
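One responsibility above, establishing evaluation frameworks for AI-generated outputs, can be sketched concretely. The following is a minimal, hypothetical static gate for AI-generated test code: it checks that the code parses, actually asserts something, and imports only from an allowlist (imports of nonexistent modules being a common hallucination signal). All names here (`ALLOWED_MODULES`, `evaluate_generated_test`) are illustrative, not from the posting.

```python
import ast

# Hypothetical allowlist of modules an AI-generated test may import.
# An import outside this set is a common hallucination signal.
ALLOWED_MODULES = {"pytest", "unittest", "math", "json"}

def evaluate_generated_test(source: str) -> dict:
    """Cheap static checks on AI-generated test code before it runs in CI."""
    result = {"parses": False, "has_assertions": False, "unknown_imports": []}
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return result  # fails the first gate; nothing else is worth checking
    result["parses"] = True
    for node in ast.walk(tree):
        if isinstance(node, ast.Assert):
            result["has_assertions"] = True
        elif isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root not in ALLOWED_MODULES:
                    result["unknown_imports"].append(root)
        elif isinstance(node, ast.ImportFrom) and node.module:
            root = node.module.split(".")[0]
            if root not in ALLOWED_MODULES:
                result["unknown_imports"].append(root)
    return result

report = evaluate_generated_test(
    "import pytest\nimport nonexistent_helper\n\n"
    "def test_add():\n    assert 1 + 1 == 2\n"
)
print(report)
```

In a real pipeline a gate like this would sit in front of execution-based evaluation (does the test run, does it catch seeded bugs), which is where "reliable and observable, not just fast" is actually earned.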

Skills

Required

  • Strong Python engineering skills
  • Experience building or leading teams that shipped multi-agent, autonomous, or AI-powered systems in production
  • Familiarity with AI-native development workflows: large language model APIs, prompt engineering, and AI coding tools used in production contexts
  • Test automation
  • Continuous integration and delivery pipeline design
  • Software quality engineering at scale
  • Translating ambiguous organizational goals into clear technical direction and executable plans
  • Hiring, coaching, and growing senior engineers

Nice to have

  • Active experimentation with AI agent frameworks, Model Context Protocol servers, and agent skills as a personal hobby or side project
  • Deep understanding of where large language models fail—hallucinations, context degradation, and tool misuse—and practical experience building mitigations into production systems
  • Experience crafting evaluation and benchmarking frameworks specifically for AI-generated artifacts
  • Prior work on developer platforms or tools teams where adoption and developer experience were primary success metrics

What the JD emphasized

  • hands-on engineering leader
  • deeply technical
  • building things
  • AI coding tools and agent frameworks
  • ambiguous, high-stakes technical challenges
  • shipped systems
  • clearing the path for builders
  • multi-agent, autonomous, or AI-powered systems in production
  • AI agent frameworks, model context protocol servers, agent skills as a personal hobby or side project
  • large language models fail—hallucinations, context degradation, and tool misuse
  • building mitigations into production systems
  • evaluation and benchmarking frameworks specifically for AI-generated artifacts

Other signals

  • building and leading a team
  • AI-powered quality infrastructure
  • multi-agent orchestration
  • evaluation frameworks for AI-generated outputs
  • autonomous pipelines