Solutions Architect, Generative AI

NVIDIA · Semiconductors · Santa Clara, CA

NVIDIA is seeking an AI Engineer or Solutions Architect to enable ecosystem partners for Generative AI. The role involves building innovative proof-of-concept solutions and reference architectures for AI agents, demonstrating NVIDIA's full-stack accelerated Generative AI platforms. Responsibilities include acting as a technical expert, developing foundational solutions, providing technical blueprints, advising on deployment, and enabling partners to build their own services and products. The role requires experience in deploying AI models at scale, building enterprise-grade agentic AI systems, and proficiency in LLM/VLM frameworks and Python/C++.

What you'd actually do

  1. Build end-to-end agentic AI applications that solve real-world enterprise problems across various industries.
  2. Serve as the primary technical domain expert for partners across pre- and post-sales, embedding deeply with them to design and deploy Generative AI solutions at scale. Maintain strong relationships with leadership and technical teams to drive adoption and successful utilization of NVIDIA GenAI platforms.
  3. Accelerate partner/customer time to value by providing repeatable reference architecture guidance, building hands-on prototypes, and advising on standard methodologies for scaling solutions to production.
  4. Establish the scope, success metrics, and evaluation criteria for partner-led customer projects, ensuring alignment to standardized and reproducible GPU-accelerated workflows.
  5. Enable strategic partners to build their own professional services, platforms, and products by integrating NVIDIA technologies and using them to accelerate high-impact customer workloads. Proactively find opportunities to drive deeper adoption and utilization of NVIDIA's Generative AI products.

Skills

Required

  • MS or PhD degree in Computer Science/Engineering, Machine Learning, Data Science, Electrical Engineering or a closely related field (or equivalent experience).
  • 5+ years of meaningful work experience deploying AI models at scale as a Software Engineer or Deep Learning Engineer.
  • Consistent track record of building enterprise-grade agentic AI systems using open-source models, and a solid foundation in deep learning, with a particular emphasis on LLMs and VLMs.
  • Hands-on experience with LLM and agentic frameworks (NeMo Agent Toolkit, LangChain, Semantic Kernel, CrewAI, AutoGen) and with evaluation and observability platforms. Comfortable building prototypes or proofs of concept.
  • Strong software development skills and proficiency in Python, C++, and deep learning frameworks (PyTorch or TensorFlow).

Nice to have

  • Demonstrate expertise in building applications and systems using NeMo Framework, Nemotron, Dynamo, TensorRT-LLM, NIMs, and AI Blueprints, and actively contribute to the open-source community.
  • Take end-to-end ownership of projects, proactively acquiring new skills or knowledge as needed to drive success.
  • Excel in fast-paced environments, adeptly managing multiple workstreams and prioritizing for the highest customer impact.
  • Understanding of advanced agent architectures and emerging communication protocols (MCP, OpenAI Agents SDK, or Google A2A).
  • Familiarity with NVIDIA GPUs and system software stacks (e.g., NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, and NVLink.

What the JD emphasized

  • building enterprise-grade agentic AI systems
  • deploying AI models at scale
  • LLM and VLM

Other signals

  • building end-to-end agentic AI applications
  • accelerated computing AI