Senior Solutions Architect, Networking - CSP

NVIDIA · Semiconductors · Santa Clara, CA +1

This role is for a Senior Solutions Architect focused on networking for AI/ML and HPC on hyperscalers. The individual will work with large customers to develop and demonstrate solutions using NVIDIA's hardware and software, debug issues, manage feature requests, and provide technical insight into AI infrastructure and performance.

What you'd actually do

  1. Work with tech giants to develop and demonstrate solutions based on NVIDIA’s groundbreaking software and hardware technologies. A large part of the day-to-day job is helping customers debug issues, managing feature request processes, and coordinating the co-engineering program.
  2. Partner with Sales Account Managers and Developer Relations Managers to identify and secure business opportunities for NVIDIA products and solutions.
  3. Be the go-to technical resource for customers building complex AI infrastructure, and help them understand the performance characteristics of those solutions.
  4. Work with customers to build PoCs for solutions that address critical business needs.
  5. Prepare and deliver technical content to customers including presentations, workshops, etc.

Skills

Required

  • BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, or other Engineering fields or equivalent experience.
  • 8+ years of engineering (performance/system/solution) experience.
  • Expertise in dense datacenter design, including networking, compute, and storage. Familiarity with cloud and hybrid cloud networking.
  • Experience with Ethernet L2-L7 networking protocol stack.
  • Strong analytical and problem-solving skills.
  • Ability to multitask efficiently in a dynamic environment and to work with teams across geographical locations.
  • Clear written and oral communication skills, with the ability to collaborate effectively with executives and engineering teams.

Nice to have

  • Deep understanding of, and hands-on experience with, Ethernet switch software solutions and production data center networks.
  • Strong experience with RDMA/RoCE and smart NICs.
  • Hands-on experience with GPU and networking systems in general, including but not limited to performance testing, tuning, and benchmarking.
  • Strong systems engineering, coding, and debugging skills, including experience with Python, Ansible, Go, C/C++, Bash, Linux, and Windows.
  • Experience supporting Deep Learning, Machine Learning, or HPC networking infrastructures; experience with networking technologies.

What the JD emphasized

  • AI/ML and HPC on hyperscalers
  • complex AI infrastructure
  • AI workload and systems performance