Senior Solutions Architect, Generative AI Specialist

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

Senior Solutions Architect specializing in Generative AI, focused on building and deploying enterprise-grade agentic AI systems, RAG pipelines, and multi-modal workflows with GPU-accelerated inference at scale. The role acts as a technical advisor to NVIDIA's advanced AI partners: leading prototyping engagements, architecting solutions, and resolving complex system issues.

What you'd actually do

  1. Serve as the primary technical point of contact for advanced AI partners, handling deep technical engagements and aligning their roadmaps with NVIDIA's platform and emerging technologies.
  2. Lead end-to-end prototyping engagements by defining requirements and crafting reference architectures that guide partners from initial concept to production-ready handoffs.
  3. Architect and build enterprise-grade agentic AI systems, RAG pipelines, and multi-modal workflows, ensuring high-performance, GPU-accelerated inference at scale.
  4. Diagnose and resolve intricate system issues and performance bottlenecks across the full AI stack, from initial model selection to large-scale deployment.
  5. Produce reusable technical assets, including implementation guides and benchmarks, that accelerate time-to-value for both partners and their customers.
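The RAG pipelines mentioned in duty 3 center on a retrieval step that ranks documents against a query before the LLM sees them. The sketch below illustrates only that idea, using a toy bag-of-words similarity in place of a real embedding model; the documents and function names are illustrative, not part of any NVIDIA stack.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a term-frequency vector (a real system uses a neural encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "GPU-accelerated inference with TensorRT",
    "Retrieval-augmented generation grounds LLM answers in documents",
    "Kubernetes schedules containerized microservices",
]
top = retrieve("retrieval augmented generation for LLM grounding", docs, k=1)
# The retrieved passages are then packed into the LLM prompt as grounding context.
```

In production the same shape holds, but with a vector database and a GPU-served embedding model replacing the toy pieces.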

Skills

Required

  • Advanced GenAI and LLM approaches
  • Proprietary vs. open model selection
  • Retrieval-augmented generation (RAG)
  • Prompt engineering
  • Production inference optimization
  • Agentic AI frameworks and protocols (LangGraph, LangChain, MCP)
  • Multimodal AI systems
  • Vision-language models
  • Audio/video workflows
  • Fine-tuning (PEFT, LoRA)
  • Synthetic data generation
  • Automated AI evaluation frameworks
  • GPU-optimized infrastructure (Docker, Kubernetes)
  • Microservices
  • MLOps pipelines
  • AI observability (trace logging, latency monitoring)
  • Responsible AI
  • Guardrails
  • Data governance
  • Regulatory considerations (HIPAA, GDPR)
  • Cloud and on-premises deployments
  • Hybrid architectures
  • Physical AI concepts
  • Simulation environments
  • Edge inference
  • Communication of complex technical concepts
  • Technical engagement roles
  • Discovery and requirements gathering
  • MS or advanced degree in Computer Science, AI, or related field, or equivalent experience
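Among the required skills, fine-tuning with LoRA has a compact core: the pretrained weight stays frozen and only a low-rank update is trained. A minimal NumPy sketch of that idea, with toy dimensions chosen for illustration (real layers and the `alpha` scaling follow the same pattern at much larger scale):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2   # toy sizes; r << d_in is the low-rank bottleneck

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-init

def lora_forward(x, alpha=16):
    # Effective weight is W + (alpha / r) * B @ A; only A and B receive gradients.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B = 0 the adapted layer matches the base model exactly at initialization.
assert np.allclose(lora_forward(x), W @ x)
# Trainable parameters: r * (d_in + d_out) instead of d_in * d_out.
```

The parameter savings are the point: here 32 trainable values stand in for a 64-value dense update, and the ratio improves as the layer grows.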

Nice to have

  • NVIDIA AI Enterprise (NVAIE)
  • NIM inference microservices
  • NeMo
  • NeMo Curator
  • Nemotron model family
  • NVIDIA Blueprints
  • TensorRT
  • Triton Inference Server
  • CUDA libraries
  • NVIDIA Agent Intelligence toolkit (NAI/NAT)
  • NemoClaw/OpenShell
  • NVIDIA Omniverse
  • Isaac Sim
  • Jetson edge computing
  • Production robotics/autonomous systems implementations
  • Open-source contributions
  • Publications
  • Patents
  • Recorded talks
  • Technical blogs

What the JD emphasized

  • 8+ years of relevant engineering experience crafting, developing, and deploying AI/ML systems and complex LLM-powered workflows, with a strong software or ML engineering foundation.
  • Strong proficiency in advanced GenAI and LLM approaches, including proprietary vs. open model selection, retrieval-augmented generation (RAG), prompt engineering, and production inference optimization.
  • Hands-on experience with fine-tuning (PEFT, LoRA), synthetic data generation, and implementing automated AI evaluation frameworks to measure quality and safety.
  • Experience with AI observability (trace logging, latency monitoring) and a solid understanding of responsible AI, including guardrails, data governance, and regulatory considerations (HIPAA, GDPR).
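The AI observability requirement above (trace logging, latency monitoring) often starts with per-stage trace records. A minimal sketch of that pattern, assuming a stand-in `generate` stage rather than a real model call; the record fields are illustrative:

```python
import functools
import json
import time
import uuid

def traced(fn):
    """Wrap a pipeline stage so every call emits a JSON trace record with latency."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        trace_id = uuid.uuid4().hex[:8]
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        latency_ms = (time.perf_counter() - start) * 1000
        # In production this record would go to a tracing backend, not stdout.
        print(json.dumps({"trace": trace_id,
                          "stage": fn.__name__,
                          "latency_ms": round(latency_ms, 2)}))
        return result
    return wrapper

@traced
def generate(prompt):
    # Stand-in for an actual LLM inference call.
    return f"echo: {prompt}"

out = generate("hello")
```

The same decorator can wrap retrieval, reranking, and generation stages so one trace ID ties a request's hops together.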

Other signals

  • building enterprise-grade agentic AI systems
  • high-performance, GPU-accelerated inference at scale
  • accelerate the transition from concept to production