Architect, AI Solutions Engineering

NVIDIA · Semiconductors · Santa Clara, CA

NVIDIA is looking for an AI Solutions Architect to scale internal AI platforms and solutions for thousands of developers. The role involves identifying AI opportunities, setting system outcomes, optimizing performance and cost, and collaborating with AI product vendors. It requires strong experience building large-scale distributed systems and hands-on experience with LLMs, RAG, fine-tuning, and agentic orchestration.

What you'd actually do

  1. Architect, build and enable internal AI platforms and solutions to be used by thousands of NVIDIANs worldwide.
  2. Spot opportunities where AI is the best tool: uncover gaps and recommend AI-first approaches over conventional solutions, grounded in hands-on evaluation of modern AI-native tools.
  3. Set the north star with cross-functional teams: align on end-to-end AI system outcomes and translate them into clear, measurable objectives.
  4. Introduce technologies enabling massively parallel systems to improve turnaround time by an order of magnitude.
  5. Lead through influence: drive, motivate, convince, and mentor sub-system owners to achieve improvements with agility, speed, and high engineering standards.

Skills

Required

  • MS/PhD in AI/CS (or equivalent experience).
  • Hands-on experience with LLMs, RAG, fine-tuning, and agentic/workflow orchestration.
  • Strong “AI-first” approach and proficiency with modern AI-native developer ecosystems and tooling.
  • Validated experience deploying to hybrid, multi-cloud environments (and ideally edge).
  • Track record architecting and shipping large-scale distributed systems in production.
  • Proven ability to find system bottlenecks and deliver measurable performance/cost improvements.
  • Strong programming skills in Java and Python.
  • Validated understanding of distributed systems concepts and REST APIs.
  • Expertise with containerization and virtualization (Docker, VMs).
  • Solid understanding of cloud/platform and data infrastructure tools such as OpenStack, Kubernetes, Chef/Puppet, Hadoop/Ceph/SwiftStack, LXC, Git/Perforce, JFrog, and Kafka.

Nice to have

  • Kubernetes experience is a plus.
  • Depth in AI, Machine Learning and Deep Learning algorithms and techniques.
  • Strong collaborative and interpersonal skills, with a proven record of guiding and influencing others in dynamic environments.
  • Industry thought leader in AI who has influenced the AI ecosystem to deliver forward-looking solutions.
  • Background in designing high-performance, scalable software systems with a strong focus on hardware cost optimization.

What the JD emphasized

  • 12+ years building systems software
  • 2+ years building/exploring AI solutions
  • Track record architecting and shipping large-scale distributed systems in production
  • Proven ability to find system bottlenecks and deliver measurable performance/cost improvements

Other signals

  • internal AI platforms
  • AI-native developer ecosystems
  • large-scale distributed systems
  • hybrid, multi-cloud environments