Senior Software Engineer, Platform Engineering

NVIDIA · Semiconductors · Santa Clara, CA

Senior Software Engineer role building next-generation AI platforms and products, with a focus on agentic AI systems, retrieval-augmented generation (RAG), and scalable infrastructure for enterprise workflows.

What you'd actually do

  1. Integrate agents with enterprise data sources, APIs, and internal microservices to enable real-world actions
  2. Architect and build agentic AI systems that leverage LLMs for reasoning, planning, and tool orchestration across enterprise workflows
  3. Develop scalable platforms for retrieval-augmented generation (RAG), long-term memory, and contextual reasoning
  4. Build reusable infrastructure for agent orchestration, tool integration, evaluation, and observability

Skills

Required

  • Python
  • designing and deploying LLM-powered systems
  • RAG
  • tool use
  • agent-based architectures
  • agent frameworks and orchestration tools (e.g., LangChain or similar)
  • high reliability, scalability, and performance
  • Kubernetes
  • cloud and hybrid environments
  • modern software stacks
  • leading complex technical initiatives
  • mentoring high-performing teams

Nice to have

  • IT Service Management (ITSM) and ServiceNow (HR, Security, Finance modules preferred)
  • enhancing enterprise efficiency and employee experience through effective use of generative AI-based solutions
  • cloud platforms
  • Docker

What the JD emphasized

  • 8+ years of experience building large-scale distributed systems and cloud-native applications
  • Strong experience designing and deploying LLM-powered systems, including RAG, tool use, and agent-based architectures
  • Deep understanding of agentic AI paradigms (planning, memory, tool invocation, multi-step reasoning)
  • Proven track record leading complex technical initiatives and mentoring high-performing teams

Other signals

  • building AI platforms and products
  • familiar with the concepts of RAG and agentic AI
  • architect and build agentic AI systems
  • develop scalable platforms for retrieval-augmented generation (RAG)
  • build reusable infrastructure for agent orchestration, tool integration, evaluation, and observability
  • deploying LLM-powered systems, including RAG, tool use, and agent-based architectures