Data Center Engineer, HPC and AI

NVIDIA · Semiconductors · Yokneam, Israel +2

NVIDIA is seeking an HPC and AI Data Center Engineer to build and manage complex data center infrastructure for supercomputers and HPC/AI clusters. The role involves planning, installation, operation, and troubleshooting of hardware, software, and networking components to support research and development activities in AI and GPU computing.

What you'd actually do

  1. Plan and build complex clusters and supercomputers in various data centers and labs
  2. Rack, stack, and manage cabling to ensure efficient use of space and easy maintenance
  3. Ensure power and cooling efficiency in data centers and labs while optimizing rack space utilization
  4. Handle daily operation and support of data centers and labs
  5. Install a variety of infrastructure and solutions: cloud, VMs, storage, network, HPC, and AI

Skills

Required

  • MCSE or MCITP/CCNA certification
  • Linux troubleshooting
  • Linux & Windows Core Services: DHCP, DNS, NIS, AD, etc.
  • Teamwork
  • Service-oriented
  • Organized

Nice to have

  • Scripting experience in Bash and/or Python
  • Widely used configuration management tools (e.g. Ansible, Puppet)
  • CI tools and common job schedulers (e.g. Jenkins, SLURM)
  • Virtualization: KVM / VMware / Hyper-V
  • Experience with L2 & L3 network protocols

What the JD emphasized

  • 3+ years of experience as a lab manager
  • Experience in supporting large and complex data centers
  • Proven hands-on experience in Linux troubleshooting, with strong problem identification and resolution skills