Senior Infrastructure Automation Engineer - SCM and HPC AI

NVIDIA · Semiconductors · Bangalore, India

NVIDIA is seeking a Senior Infrastructure Automation Engineer to build automations for large fleets of servers in a cluster environment, focusing on application, OS, and server hardware components. The role involves creating solutions to improve infrastructure reliability and performance, deploying improvements globally using automated orchestration tools, and evaluating technology options. The engineer will also collaborate with project members to define solutions, create schedules, and lead continuous improvements, ultimately aiming to enhance the productivity of chip designers and software engineers.

What you'd actually do

  1. You'll be on the team responsible for building automation for large fleets of servers in a large cluster environment, including application, OS, and server hardware components, and for developing the continued automation and innovation that environment needs.
  2. Create new solutions to improve the reliability and performance of our ever-growing infrastructure, and work with automated orchestration tools to deploy those improvements to thousands of systems worldwide.
  3. As part of a distributed team, you will evaluate technology options. You will collaborate with project members to define solutions, create schedules, and lead continuous improvements and support.
  4. Learn and greatly improve the daily productivity of the world’s top chip designers and software engineers.

Skills

Required

  • Bare-metal provisioning automation and optimisation
  • architecting and implementing distributed systems
  • configuring and deploying Continuous Integration (CI) and Continuous Deployment (CD) systems
  • software engineering process skills
  • DevOps or system administration with Linux systems
  • Ansible experience
  • excellent interpersonal skills
  • written and verbal communication

Nice to have

  • object-oriented programming and design pattern knowledge and background
  • Go, Python, Object Oriented Perl, or Java
  • MySQL or Postgres
  • NoSQL databases
  • CentOS/RHEL and Ubuntu
  • out-of-the-box thinking

What the JD emphasized

  • strong software engineering process skills required
  • Experience with DevOps or system administration with Linux systems required