Senior Solutions Architect, NPN

NVIDIA · Semiconductors · VA · Remote

This role focuses on designing, building, and operationalizing large-scale AI solutions, specifically Agentic AI, for customers through partners. It involves leveraging NVIDIA's technology stacks, including compute, networking, and software, to deploy AI factories and solve complex industry problems. The role requires strong expertise in HPC, AI clusters, and cloud-native methodologies, with a blend of technical guidance, proof-of-concept assistance, and knowledge sharing.

What you'd actually do

  1. Guiding partners in their adoption of end-to-end Agentic AI solutions, using NVIDIA's compute, networking, and software stacks.
  2. Using cloud-native methodologies, low-latency networks, and accelerated compute to help build modern AI factories.
  3. Delivering demos, assisting with proof-of-concepts, or writing papers and developer blogs.
  4. Collaborating with executives and engineering to solve complex problems and help bring NVIDIA's premier technologies to life in the cloud and in the datacenter.

Skills

Required

  • BS, MS, or PhD in Engineering, Computer Science, or a related field (or equivalent experience)
  • Established track record working with AI and HPC clusters, both on-premises and cloud-based.
  • 12+ years of proven experience with cluster management and related tools, including Docker containers, Slurm, Kubernetes, and Ansible.
  • Hands-on experience with network, storage, and cluster configuration and debugging.
  • Strong analytical and problem-solving skills
  • Ability to articulate what you know to others
  • Ability to multitask efficiently in a dynamic environment

Nice to have

  • Strong coding and debugging skills, including experience with Python, C/C++, Bash, and Linux utilities.
  • Demonstrated expertise through projects or open-source contributions involving GPU workloads, Kubernetes, InfiniBand, Ethernet, or other areas related to high-performance clusters and hybrid cloud solutions.
  • Hands-on experience with NVIDIA AI Enterprise, Base Command Manager, Run:ai, and NVIDIA NIMs.
  • Willingness and ability to learn quickly and solve advanced problems.

What the JD emphasized

  • end-to-end Agentic AI solutions
  • AI solutions at scale
  • solve complex problems
  • advanced problems

Other signals

  • designing, building, and maintaining large-scale HPC and AI hybrid computing solutions
  • deploy and operationalize AI solutions at scale
  • guiding partners in their adoption of end-to-end Agentic AI solutions
  • using cloud-native methodologies, low-latency networks, and accelerated compute to help build modern AI factories