Senior Software Engineer, AI Agent Runtime and Open Source Infrastructure

NVIDIA · Semiconductors · Santa Clara, CA +6

Senior Software Engineer role building production-grade features for an AI agent runtime and open-source infrastructure, with a focus on onboarding, policy controls, inference routing, and sandbox lifecycle. The work includes developing secure agent runtime infrastructure, taking part in daily open-source workflows, and diagnosing complex failures across various platforms.

What you'd actually do

  1. Build and implement production-grade features across NemoClaw, focusing on onboarding flows, policy controls, inference routing, and sandbox lifecycle.
  2. Develop and sustain secure agent runtime infrastructure, ensuring strong network policy administration, credential management, and failure recovery.
  3. Engage in daily open-source workflows: author pull requests, conduct technical reviews, address issues, write tests, and contribute to documentation.
  4. Use AI-assisted development tools to improve the engineering loop, while applying rigorous verification and security measures.
  5. Develop tools, test harnesses, automation scripts, and CI/CD workflows to boost team efficiency.

Skills

Required

  • BS, MS, or equivalent experience in Computer Science, Software Engineering, or a related technical field.
  • 12+ years of experience in developing and managing production software systems, developer infrastructure, or open-source platforms.
  • Strong systems engineering fundamentals with a proven track record of solving multifaceted problems.
  • Proficient in at least one major programming language and able to learn TypeScript, JavaScript, Node.js, and Rust quickly.
  • Comfort working in large codebases, with experience in reading unfamiliar code, conducting detailed reviews, and improving maintainability.
  • Demonstrated experience with open-source practices, including managing tasks, pull requests, code reviews, and public technical discussions.
  • Experience with AI-supported development tools and a solid understanding of validating generated code.
  • Security-conscious engineering approaches, particularly concerning secrets management, sandboxing, and network policy enforcement.
  • Solid testing, continuous integration and delivery, and debugging abilities, with the capability to replicate failures, determine root causes, and clearly convey results.
  • Excellent written and verbal communication skills, capable of explaining technical concepts to diverse audiences.

Nice to have

  • Contributions to open-source developer infrastructure, AI tooling, or large public software projects.
  • Hands-on experience with AI coding agents, workflow automation, or multi-agent systems.
  • Experience with containers and Linux isolation technologies including Docker, Kubernetes, and network policy management.
  • Demonstrated experience in developing dependable CI, comprehensive validation, and test infrastructure for dynamic software.
  • Familiarity with LLM inference, GPU-backed workloads, or performance-sensitive AI infrastructure.
  • Demonstrated ability to raise engineering standards through thoughtful reviews, clear documentation, and effective mentoring.

What the JD emphasized

  • 12+ years of experience in developing and managing production software systems, developer infrastructure, or open-source platforms.
  • Strong systems engineering fundamentals with a proven track record of solving multifaceted problems.
  • Security-conscious engineering approaches, particularly concerning secrets management, sandboxing, and network policy enforcement.

Other signals

  • agentic AI
  • developer infrastructure
  • runtime security
  • open-source stacks
  • AI engineering practices