Infrastructure Software Engineer, Deep Learning Libraries

NVIDIA · Semiconductors · Shanghai, China +1

NVIDIA is seeking an Infrastructure Software Engineer to work on deep learning libraries like TensorRT and TensorRT-LLM. The role involves designing and developing scalable, modular infrastructure for development, build, and test processes across various NVIDIA platforms. Responsibilities include software design for testing and analysis, building automation for CI/CD, developing across the software stack, and configuring industry-standard tools like Kubernetes and Jenkins.

What you'd actually do

  1. Designing and developing software for testing and analysis of our codebases
  2. Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
  3. Developing throughout the software stack, from the user experience down to the cluster and database layers
  4. Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Kubernetes, Jenkins, Docker, CMake, GitHub, GitLab, Jira)
  5. Advancing the state of the art in those industry-standard tools

Skills

Required

  • Python
  • C/C++
  • continuous integration systems
  • SCM
  • build systems

Nice to have

  • Jenkins with Groovy
  • distributed systems
  • cluster/cloud computing
  • Kubernetes
  • unit and integration test frameworks
  • code coverage
  • static code analysis tools
  • GPU
  • mobile/embedded platforms
  • multiple operating systems

What the JD emphasized

  • deep learning libraries
  • TensorRT
  • TensorRT-LLM
  • scalable, modular infrastructure
  • deep learning platforms
  • automation
  • enabling team members