AI Factory Deployment Engineer

NVIDIA · Semiconductors · Santa Clara, CA

This role supports the deployment of control systems for NVIDIA's AI Factories, focusing on data center infrastructure. Responsibilities include gathering requirements, adapting control system designs, evaluating control systems technically, and providing technical support. The role also involves IT-to-OT data integration for applications such as digital twins and agentic AI onboarding, as well as standardization in controls engineering.

What you'd actually do

  1. Collaborate with product owners and technical leads to identify and collect requirements for our next-generation data centers.
  2. Support the global design standards for the data center controls and monitoring (DCCM) system, collaborating with internal teams to develop an execution strategy and life cycle management.
  3. Adapt control system reference designs and standards to AI Factory deployments.
  4. Act as a key collaborator responsible for control system technical evaluation, from site selection due diligence through site turnover to operations, including contractor selection, bid package development, MEP or equivalent experience and control system composition review, RFI response, submittal/as-built reviews, and commissioning support.
  5. Support IT-to-OT data integration enabling digital twins, agentic AI onboarding, coordinated leak detection, and other applications.

Skills

Required

  • BS in Engineering, CS or equivalent experience
  • 8+ years of experience with control system design, development and management on industrial or mission critical systems
  • Working knowledge of mechanical, electrical, life safety, and IT Networking systems associated with critical environments
  • Understanding of the OPC-UA and Modbus (TCP and RTU) protocols and how to integrate systems using them.
  • Troubleshooting and problem-solving skills, and experience driving root cause analysis on complex projects under pressure
  • Experience with equipment commissioning, testing, or related activities
  • Experience with startup and configuration of Programmable Logic Controllers (PLCs) and SCADA workstations.
  • Strong understanding of Sequence of Operations (SOO) for mechanical system control. Ability to create and iterate on SOOs.

Nice to have

  • Experience with the MQTT communication protocol, higher-level data strategies, and integration with IT systems
  • Strong understanding of data center commissioning including Level 1 through Integrated Systems Testing
  • Strong understanding of document control and change control processes
  • Working knowledge of and experience with Data Center Infrastructure Management (DCIM), EPMS, Ignition SCADA software development and deployment, and the programming languages Python, PHP, and SQL
  • Working knowledge of data center power and cooling solutions, including advanced systems such as liquid cooling

What the JD emphasized

  • 8+ years of experience with control system design, development and management on industrial or mission critical systems
  • Working knowledge of mechanical, electrical, life safety, and IT Networking systems associated with critical environments
  • Experience with equipment commissioning, testing, or related activities
  • Experience with startup and configuration of Programmable Logic Controllers (PLCs) and SCADA workstations.