Senior Solutions Architect, Generative AI Deployment and AIOps

NVIDIA · Semiconductors · Santa Clara, CA +5 · Remote

Senior Solutions Architect focused on deploying Generative AI and LLMs, optimizing inference performance on Kubernetes, and collaborating with customers and internal teams on NVIDIA's AI platforms.

What you'd actually do

  1. Partnering with other solution architects, engineering, product, and business teams to understand their strategies and technical needs and help define high-value solutions
  2. Dynamically engaging with developers, scientific researchers, and data scientists, gaining experience across a range of technical areas
  3. Strategically partnering with lighthouse customers and industry-specific solution partners targeting our computing platform
  4. Working closely with customers to help them adopt and build creative solutions using NVIDIA technology and MLOps solutions
  5. Analyzing performance and power efficiency of AI inference workloads on Kubernetes

Skills

Required

  • Deep Learning frameworks (PyTorch, TensorFlow)
  • Python programming
  • GPU orchestration
  • Kubernetes
  • Containerization
  • Orchestration technologies
  • Monitoring solutions for AI deployments
  • Observability solutions for AI deployments
  • LLM inference
  • DL inference

Nice to have

  • DL training at scale
  • DL inference deployment
  • DL inference optimization
  • NVIDIA NIM
  • Dynamo
  • TensorRT
  • TensorRT-LLM
  • C/C++ programming
  • Profiling
  • Code optimization
  • Performance analysis
  • Test design
  • Parallel programming
  • Distributed computing platforms

What the JD emphasized

  • 8+ years of hands-on experience with Deep Learning frameworks such as PyTorch and TensorFlow
  • Excellent knowledge of the theory and practice of LLM and DL inference
  • Prior experience with DL training at scale, deploying or optimizing DL inference in production

Other signals

  • customer-facing
  • inference
  • deployment
  • MLOps