Senior Product Architect, Storage

NVIDIA · Semiconductors · Santa Clara, CA +1 · Remote

NVIDIA is seeking a Senior Product Architect to design and validate AI storage infrastructure, with a focus on optimizing systems for large-scale foundation model training, disaggregated inference, and agentic AI pipelines. The role involves developing end-to-end reference architectures, defining system-level designs, and collaborating with partners and customers to deliver proofs of concept.

What you'd actually do

  1. Design end-to-end reference architectures for disaggregated inference (aligned with NVIDIA Dynamo), large-scale foundation model training, and agentic AI pipelines, co-developed with storage and ecosystem partners.
  2. Design and validate storage-optimized AI infrastructure, including KV Cache tiering strategies, checkpoint acceleration, and high-throughput dataset pipelines that leverage RDMA and NVMeoF fabrics.
  3. Define system-level architectures spanning Rubin graphics processors, Vera central processing units, BlueField data processing units, NVLink interconnects, and Spectrum-X Ethernet to improve efficiency across the full AI lifecycle.
  4. Develop and publish reference architectures, whitepapers, and deployment guides for the NVIDIA AI Data Platform and partner-integrated solutions.
  5. Drive prototyping, benchmarking, and performance validation of AI infrastructure at scale - diagnosing bottlenecks across compute, networking, and storage layers.

Skills

Required

  • Datacenter-scale AI, HPC, or storage infrastructure architecture
  • Disaggregated inference architectures
  • LLM training pipelines
  • Autonomous AI system patterns
  • RDMA (RoCEv2/InfiniBand)
  • High-performance storage protocols (NVMeoF, GPFS, Lustre, S3-compatible object storage)
  • Low-latency fabric design
  • KV Cache management strategies
  • Tiered memory/storage hierarchies for inference optimization
  • Retrieval-Augmented Generation (RAG) architectures
  • NVIDIA DOCA or equivalent DPU/SmartNIC programming frameworks
  • Spectrum-X Ethernet
  • InfiniBand
  • NVLink Switch fabrics
  • Congestion control
  • Datacenter topologies

Nice to have

  • Reference architecture co-development with storage/infrastructure OEM partners
  • Hands-on deployment experience with disaggregated inference systems
  • Deep familiarity with NVIDIA Grace-Hopper, Grace-Blackwell, or Vera-Rubin platforms

What the JD emphasized

  • 12+ years of experience architecting datacenter-scale AI, HPC, or storage infrastructure as a Principal Architect, Solutions Architect, Principal Engineer, or equivalent.
  • Deep expertise in building AI infrastructure, including disaggregated inference architectures, LLM training pipelines, and autonomous AI system patterns.
  • Hands-on experience with RDMA (RoCEv2/InfiniBand), high-performance storage protocols (NVMeoF, GPFS, Lustre, or S3-compatible object storage), and low-latency fabric design.
  • Strong understanding of KV Cache management strategies, including tiered memory/storage hierarchies for inference optimization.
  • Familiarity with Retrieval-Augmented Generation (RAG) architectures and the storage, indexing, and retrieval patterns they demand at scale.
  • Experience with NVIDIA DOCA or equivalent DPU/SmartNIC programming frameworks for offloading data plane and storage services.
  • Proven foundation in networking: Spectrum-X Ethernet, InfiniBand, NVLink Switch fabrics, congestion control, and datacenter topologies.

Other signals

  • AI infrastructure
  • LLM training
  • agentic AI pipelines
  • inference optimization