Senior DevOps Engineer

NVIDIA · Semiconductors · Pune, India

Senior DevOps Engineer role focused on building and maintaining Kubernetes-based infrastructure for AI development, compute, and testing environments. The role involves architecting scaling operations, managing compute infrastructure, designing AI tools for automation, and prototyping cloud infrastructure for NVIDIA.

What you'd actually do

  1. Architect the scaling operation in our data centers. Deploy and support an end-to-end container management solution with Kubernetes, Docker, and containerd. Design solutions with service discovery, networking, monitoring, logging, and scheduling in Kubernetes.
  2. Set up and manage end-to-end compute infrastructure using PaaS and IaaS services: tools, plugins, nodes, user management, backup, restore, monitoring, etc. Design and develop the AI tools needed to automate maintenance of 35,000+ hosts with only 12 support engineers.
  3. Design and build sophisticated automations and AI-powered applications.
  4. Apply your depth in algorithms and your system software background.
  5. Work in teams to deploy new data center infrastructure.

Skills

Required

  • Kubernetes
  • Docker
  • containerd
  • Python
  • Golang
  • Java
  • Ansible
  • Chef
  • Puppet
  • Jenkins
  • VMs
  • SQL
  • NoSQL
  • Elasticsearch
  • MongoDB
  • MySQL
  • Kibana
  • Grafana
  • Splunk
  • Zabbix
  • Nagios

Nice to have

  • Slurm
  • OpenStack
  • DevOps
  • SRE

What the JD emphasized

  • AI tools needed for automating maintenance
  • AI-powered applications
  • reuse AI techniques

Other signals

  • AI tools for automation
  • AI-powered applications
  • reuse AI techniques
  • cloud infrastructure for NVIDIA