Infrastructure Software Engineer, Deep … at NVIDIA

We are now looking for an Infrastructure Software Engineer for Deep Learning Libraries!

NVIDIA's Deep Learning Libraries Group is seeking excellent software engineers to enable the next wave of NVIDIA's highest performing deep learning libraries. The role spans multiple products, including TensorRT and TensorRT-LLM. The mission is to design and develop scalable, modular infrastructure that streamlines development, build, and test across NVIDIA's diverse set of platforms, from Drive AGX for autonomous vehicles to DGX servers for datacenters and large language models. Join our technically diverse team of software engineers and infrastructure experts to design the systems that enable NVIDIA to stay ahead of the competition as we deliver the world's fastest deep learning platforms.

What you'll be doing:

Designing and developing software for testing and analysis of our codebases
Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
Developing throughout the software stack, from the user experience down to the cluster and database layers
Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Kubernetes, Jenkins, Docker, CMake, GitHub, GitLab, Jira)
Advancing the state of the art in those industry-standard tools

What we need to see:

BS or higher degree in Computer Science or Computer Engineering, or equivalent experience
2+ years of relevant experience
Strong programming skills in Python (or similar) and familiarity with C/C++ development
Experience setting up, maintaining, and automating continuous integration systems (e.g. Jenkins)
Fluency in SCM (e.g. Git, Perforce) and build systems (e.g. Make, CMake, Bazel)
A pragmatic approach to problem solving and collaboration
Passion for "it just works" automation and enabling team members

Ways to stand out from the crowd:

Experience designing and developing automation in Jenkins with Groovy (or similar)
Background with distributed systems and cluster/cloud computing, especially with Kubernetes
Experience designing and developing unit and integration test frameworks
Hands-on experience with code coverage and static code analysis tools
Experience with GPUs, mobile/embedded platforms, and multiple operating systems (Ubuntu, Red Hat, Windows, QNX, L4T, or similar)

This is an opportunity to have a wide impact at NVIDIA by improving development velocity across our many deep learning software projects. Are you creative, driven, and autonomous? Do you love a challenge? If so, we want to hear from you!
