Senior Deep Learning Frameworks Sustaining Engineer

NVIDIA · Semiconductors · Santa Clara, CA

Senior Deep Learning Frameworks Sustaining Engineer at NVIDIA. Responsibilities include back-porting changes, managing open source dependencies, and fixing bugs in enterprise products such as TensorFlow, PyTorch, and TensorRT. The role focuses on keeping dependencies stable and addressing security vulnerabilities for NVIDIA AI Enterprise.

What you'd actually do

  1. Back-porting changes from the mainline branch
  2. Keeping track of open source dependency changes
  3. Ensuring the latest stable dependencies are used in our enterprise products
  4. Contributing changes to the team to support timely Long Term Support releases of the TensorFlow, PyTorch, and TensorRT products
  5. Fixing customer-reported bugs, integrating bug fixes found in mainline, and working with other teams to ensure open source dependencies are patched for security vulnerabilities

Skills

Required

  • Excellent C/C++ programming and software design skills
  • Debugging
  • Open source integration
  • Using tools involved in building software (Make, Docker, Bazel)
  • Packaging systems (Debian, pip, npm, etc.)
  • Build systems (GitLab CI, Jenkins)
  • Machine learning algorithms and frameworks (TensorFlow, PyTorch, or MXNet)
  • Ability to work independently
  • Contributing to the stability of releases
  • Communicating status effectively

Nice to have

  • Python experience
  • GPU programming experience (CUDA or OpenCL)
  • Experience contributing to or managing a large open source project
  • Use of GitHub, bug tracking, branching and merging code, OSS licensing issues, managing patches, etc.
  • Familiarity with GitLab CI pipelines

What the JD emphasized

  • enterprise products
  • security vulnerabilities

Other signals

  • open source code bases
  • customer-reported bugs