NVIDIA is looking for an HPC and AI Data Center Engineer to join the networking cloud solutions HPC/AI Infrastructure team. We are focused on building supercomputers and HPC clusters based on groundbreaking technologies. We are looking for a lab manager to be a key player working with the most exciting computing hardware and software and to contribute to the latest breakthroughs in artificial intelligence and GPU computing. Take part in building large-scale compute and deep learning software and hardware platforms, and work with and support scientific researchers, developers, and customers to craft improved workflows and develop new, differentiated solutions.

What you will be doing:

Plan and build complex clusters and supercomputers in various data centers and labs
Rack, stack, and manage cabling to ensure efficient use of space and easy maintenance
Ensure power and cooling efficiency in data centers and labs while optimizing rack space utilization
Handle daily operation and support of data centers and labs
Install a variety of infrastructure and solutions: Cloud, VMs, Storage, Network, HPC, and AI
Troubleshoot networks, optical cabling, bare metal, and operating systems
Support Research & Development activities

What we need to see:

MCSE or MCITP/CCNA certification
3+ years of experience as a lab manager
Experience in supporting large and complex data centers
Proven hands-on experience in Linux troubleshooting, with strong problem identification and resolution skills
In-depth knowledge of Linux & Windows core services: DHCP, DNS, NIS, AD, etc.
Team player, service-oriented, and organized

Ways to stand out from the crowd:

Scripting experience in Bash and/or Python
Experience with widely used configuration management tools (e.g. Ansible, Puppet)
Experience with CI tools and job schedulers (e.g. Jenkins, SLURM)
Virtualization: KVM / VMware / Hyper-V
Experience with L2 & L3 network protocols

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you!

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.