Senior Tools Development Engineer

NVIDIA · Semiconductors · Pune, India

NVIDIA is seeking a Senior Tools Development Engineer to build agentic infrastructure for test automation and quality engineering on the Omniverse platform. The role involves designing and deploying multi-agent systems, orchestration frameworks, and evaluation systems to improve software quality and reliability.

What you'd actually do

  1. Develop and deploy multi-agent systems for automated test generation, log analysis, failure triage, and bug-filing workflows
  2. Build and maintain agent orchestration frameworks using tools such as Claude Code, MCP servers, and agent SDK patterns
  3. Create autonomous pipelines that reduce cognitive load on engineers by routing failures, surfacing root causes, and generating actionable bug reports
  4. Build evaluation systems to measure agent output quality — ensuring autonomous pipelines are reliable, not just fast
  5. Establish observability and monitoring for agentic workflows so failures are transparent, debuggable, and recoverable

Skills

Required

  • Strong Python engineering
  • Deep familiarity with AI-native development workflows — Claude Code, Cursor, LLM APIs, prompt engineering in production
  • Hands-on experience building multi-agent or autonomous systems that have shipped and run without continuous supervision
  • Clear understanding of where LLMs fail — hallucination, context degradation, tool misuse — and experience building mitigations into system design, including evaluation frameworks for AI-generated outputs
  • Graduate degree in Computer Science Engineering or equivalent
  • 5+ years in test automation, CI/CD pipeline design, or software quality engineering, including failure analysis and test triage at scale
  • Ability to reason strategically about test coverage across a complex, frequently releasing platform SDK

Nice to have

  • High agency
  • Patience and communication skills to build systems that colleagues can trust and adopt
  • Intellectual honesty about where systems break, with a habit of building in recovery paths rather than hiding failures

What the JD emphasized

  • high-agency engineers
  • autonomous agents
  • agentic infrastructure
  • multi-agent systems
  • agent orchestration frameworks
  • autonomous pipelines
  • agent output quality
  • agentic workflows
  • multi-agent or autonomous systems that have shipped and run without continuous supervision
  • evaluation frameworks for AI-generated outputs

Other signals

  • building agentic infrastructure
  • design and build multi-agent systems
  • build evaluation systems to measure agent output quality
  • establish observability and monitoring for agentic workflows