Senior Networking Performance Architect

NVIDIA · Semiconductors · Tel Aviv, Israel +1

NVIDIA is seeking a Senior Networking Performance Architect to shape the future of high-performance and ML/AI computing. The role involves analyzing network feature performance for AI workloads on large-scale HPC clusters, developing network behavior models, and generating insights for next-generation products. The ideal candidate will have a strong background in system engineering or architecture, performance research, and Python, along with a good understanding of AI models and large-scale networks.

What you'd actually do

  1. Analyze the performance of novel network features and their impact on AI workloads running on next-generation large-scale, high-performance computing (HPC) clusters.
  2. Develop network behavior models for simulation and performance analysis, and prioritize the development of future networking features.
  3. Generate insights that guide new generations of products so that they remain state of the art for AI.
  4. Collaborate with cross-functional teams, including deep learning, simulation, architecture, and research.
  5. Become an expert in AI networking for high-performance computing.

Skills

Required

  • B.Sc., M.Sc., or Ph.D. degree in Computer Science, Computer Engineering, or Electrical Engineering.
  • 5+ years of experience in system engineering, system architecture, or performance research.
  • Deep understanding of complex system behavior.
  • Strong problem-solving and critical-thinking skills.
  • Ability to work in a highly dynamic environment, be a team player, and collaborate with multiple groups across the organization.
  • Good knowledge of Python.

Nice to have

  • Experience developing simulation environments.
  • Understanding of large-scale network behavior and the effect of distributed computing workloads on the network.
  • Understanding of network protocols (InfiniBand, IP, TCP, RoCE).
  • Good knowledge of AI models.

What the JD emphasized

  • AI workloads
  • AI networking
  • large-scale, high-performance computing (HPC) clusters
