Infrastructure Team Manager

NVIDIA · Semiconductors · Raanana, Israel

NVIDIA is seeking an experienced IT/Lab Manager to lead the planning, deployment, and operations of its physical lab environment and IT systems. The role involves building and maintaining scalable, reliable, and secure environments that support engineering teams in research, QA, and validation, as well as internal collaborators. Responsibilities include managing day-to-day operations, leading and mentoring an IT/Lab team, collaborating with R&D and other engineering teams, overseeing data center and lab operations, managing procurement and vendors, implementing automation, and maintaining monitoring and alerting systems. The ideal candidate has at least 10 years of IT/systems administration experience with Linux/Unix, at least 3 years in a management role, and experience with data center management, automation, and monitoring.

What you'd actually do

  1. Own day-to-day operations, planning, and roadmap for the engineering lab and IT infrastructure (servers, storage, networking, and related services).
  2. Lead and mentor an IT/Lab team, driving guidelines, standards, and a culture of ownership, partnership, and continuous improvement.
  3. Collaborate closely with R&D, QE, Verification, and other engineering teams to design, provision, and maintain environments that meet their performance, reliability, and security needs.
  4. Lead all aspects of running data center and lab operations, including rack layout, cabling, power and cooling, hardware lifecycle, and resource availability.
  5. Lead procurement and vendor management for hardware, software, and services, including evaluation, negotiation, and ongoing relationship management.

Skills

Required

  • B.Sc. or BA in Computer Science, Engineering, or a related field, or equivalent practical experience
  • At least 10 years of overall experience in IT / systems administration
  • At least 3 years of experience in a managerial or team-lead position within IT, lab, or infrastructure teams
  • Vast experience with Linux/Unix system administration, including installation, configuration, troubleshooting, and performance tuning
  • Demonstrable experience collaborating with engineering organizations (R&D, QE, Verification, etc.) and supporting their infrastructure needs
  • Solid experience with data center and lab management, including server, network, and storage equipment deployment and lifecycle
  • Demonstrated experience in procurement and vendor management for infrastructure hardware and software
  • Proficiency in automation and scripting (e.g., shell, Perl, Ansible) for provisioning, configuration, and operational tasks
  • Hands-on experience with monitoring and alerting solutions for infrastructure and services
  • Strong debugging skills and experience resolving complex, cross-domain technical issues

Nice to have

  • Experience with Kubernetes (K8s) in on-prem or hybrid environments
  • Hands-on work with Slurm, HPC clusters, and large-scale compute environments
  • Background in HPC, large-scale Linux clusters, or performance-sensitive engineering environments

What the JD emphasized

  • At least 10 years of overall experience in IT / systems administration
  • At least 3 years of experience in a managerial or team-lead position
  • Vast experience with Linux/Unix system administration
  • Solid experience with data center and lab management