Senior Solutions Architect Networking Ethernet - NVIS

NVIDIA · Semiconductors · Australia · Remote

NVIDIA is seeking a Senior Solutions Architect for Networking Ethernet to join its NVIDIA Infrastructure Specialist Team. This role involves building and supporting AI/HPC infrastructure for customers, focusing on networking, system design, and automation. The architect will engage with customers and internal teams to define and implement large-scale networking projects, ensuring the operational health and reliability of AI clusters.

What you'd actually do

  1. Build AI/HPC infrastructure for new and existing customers.
  2. Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
  3. Engage in and improve the whole lifecycle of services—from inception and design through deployment, operation, and refinement.
  4. Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
  5. Provide feedback to internal teams by opening bugs, documenting workarounds, and suggesting improvements.

Skills

Required

  • networking fundamentals
  • TCP/IP stack
  • data center architecture
  • configuring, testing, validating, and resolving issues in LAN networks
  • medium to large-scale HPC/AI environments
  • EVPN
  • BGP
  • OSPF
  • VXLAN protocols
  • network switch/router platforms like Cumulus Linux, SONiC, IOS, Junos OS, and EOS
  • automated network provisioning solutions
  • Ansible
  • Salt
  • Python
  • CI/CD pipelines for network operations
  • Strong focus on customer needs and satisfaction
  • Self-motivated with leadership skills
  • work collaboratively with customers and internal teams
  • Strong written, verbal, and listening skills in English

Nice to have

  • cloud networks (AWS, GCP, Azure)
  • Linux or Networking Certifications
  • High-performance computing architectures
  • job schedulers (Slurm, PBS)
  • Cluster management technologies
  • BCM (Base Command Manager)
  • GPU (Graphics Processing Unit) focused hardware/software

What the JD emphasized

  • excellent interpersonal skills
  • customer focused team
  • customer needs and satisfaction