Senior Software Engineer, DL Libraries Infrastructure

NVIDIA · Semiconductors · Santa Clara, CA +4 · Remote

NVIDIA is seeking a Senior Software Engineer to design and develop scalable infrastructure for its deep learning libraries, focusing on build, test, integration, and release processes across various platforms. The role involves working with industry-standard tools and contributing to open-source communities.

What you'd actually do

  1. Designing and developing software for testing and analysis of our codebases
  2. Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
  3. Developing throughout the software stack, from the user experience down to the cluster and database layers
  4. Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Kubernetes, Jenkins, Docker, CMake, GitLab, Jira, etc.)
  5. Advancing innovation in those industry-standard tools and upstreaming contributions to the open-source community

Skills

Required

  • BS or higher degree in Computer Science or Computer Engineering (or equivalent experience) with 5+ years of relevant experience
  • Strong programming skills in Python (or similar) and familiarity with C/C++ development
  • Experience setting up, maintaining, and automating continuous integration systems
  • Proficiency in SCM (e.g. Git, Perforce) and build systems (e.g. Make, CMake, Bazel)

Nice to have

  • Experience designing and developing automation in Jenkins, GitLab CI/CD, or GitHub Actions
  • Background with distributed systems and cluster/cloud computing (e.g. Slurm, containers, Kubernetes, etc.)
  • Experience designing and developing unit and integration test frameworks with hands-on experience using code coverage and static code analysis tools
  • Success leading a team of engineers and/or experience as an active contributor to a software project involving many developers
  • Knowledge of GPU computing systems and experience with mobile/embedded platforms and multiple operating systems (Ubuntu, CentOS, Windows, L4T, or similar)
  • Track record of identifying useful new technologies and incorporating them into software development flows

What the JD emphasized

  • Passion for “it just works” automation