Senior System Integration and Validation Engineer

NVIDIA · Semiconductors · Bangalore, India

Senior engineer to lead system validation, debug, and cross-functional alignment on silicon programs. Own end-to-end system validation of GPU, SoC, and platform programs, lead debug of complex cross-stack issues, and develop test plans and automation. Design and deploy AI-enabled debug, triage, and regression workflows with guardrails and measurable impact.

What you'd actually do

  1. Own end-to-end system validation of NVIDIA GPU, SoC, and platform programs: feature checks, PVT (process, voltage, temperature) stress, system-stress campaigns at scale, and multi-unit fleet testing.
  2. Lead debug of the hardest cross-stack issues — logic, signal integrity, power delivery, firmware, and software interaction — and drive them to root cause with reusable workarounds and productized fixes.
  3. Develop test plans, scripts, and automation for next-generation chips ahead of physical builds, translating architecture, boot flows, high-speed I/O, and power and thermal dependencies into executable coverage.
  4. Design and deploy AI-enabled debug and validation workflows used by the team — with explicit guardrails (evals, regression validation, false-positive handling) and measurable impact on cycle time, debug velocity, or escape rate.
  5. Translate complex silicon and system risk into decision-ready options for technical and executive audiences — without padding or hand-waving.

Skills

Required

  • BTech/BE or MTech/ME in Electronics, Electrical, or Computer Engineering (or equivalent experience), plus 8 to 12 years of post-silicon validation, system integration, or platform debug experience on shipped GPU, CPU, or SoC products.
  • Hands-on debug depth across silicon, board, and software boundaries (logic design, signal integrity, power delivery, high-speed I/O, and PVT behavior), backed by strong EE fundamentals in SI/PI and thermal and a working understanding of GPU, CPU, or SoC architecture across PC, Datacenter, or Automotive.
  • At least one specific example of taking an ambiguous, multi-team failure to root-cause closure with a productized fix, and a track record of leading a project team end-to-end through a real crisis with decision points you owned.
  • Demonstrated AI-driven validation workflow you built or scaled, not just used — with adoption beyond yourself and measurable impact on debug velocity, coverage, or escape rate. You can describe the guardrails you put in place and where AI is dangerous in your workflow.

Nice to have

  • A history of building reusable validation methodology, debug playbooks, or test frameworks that were adopted by other programs or teams — backed by patents, conference papers, invited talks, or recognized contributions.
  • Hands-on subsystem depth in HBM, SerDes or high-speed I/O, power and thermal, or advanced packaging — failure modes, debug instrumentation, and the tradeoffs that show up in production.
  • Experience partnering deeply with a counterpart team in India or another major engineering hub — examples of shared culture, shared metrics, and shared on-call across geographies.
  • AI work that goes beyond personal-copilot use — agentic workflows, RAG-grounded debug assistants, regression bucketing, or automated triage deployed at team scope with adoption metrics.

What the JD emphasized

  • critical silicon and platform issues before they reach the customer
  • Build AI-enabled validation as a real capability, not a demo
  • explicit guardrails (evals, regression validation, false-positive handling)
  • Demonstrated AI-driven validation workflow you built or scaled, not just used
  • adoption beyond yourself and measurable impact on debug velocity, coverage, or escape rate
  • explicit guardrails you put in place and where AI is dangerous in your workflow

Other signals

  • AI-enabled validation
  • AI-driven validation workflow
  • debug, triage, root-cause hypothesis generation, and regression workflows with the evals and guardrails