Senior Software Engineer for Robotics Research - Tooling

NVIDIA · Semiconductors · Santa Clara, CA +1 · Remote

Senior Software Engineer focused on building ML productivity tooling and CI/CD frameworks for robotics research, specifically for humanoid robots and foundation models. The role involves developing visualization tools, applying AI agents to improve programming efficiency, and collaborating with researchers to deliver full-stack solutions.

What you'd actually do

  1. Build highly scalable, robust, and efficient CI/CD frameworks. The workload is data intensive and requires CPU/GPU heterogeneous computation.
  2. Build world-class visualization tools for analyzing and optimizing all our datasets and compute jobs (across tens of thousands of GPUs).
  3. Develop and apply AI agents to significantly improve programming efficiency within the team and reduce the human effort needed to fix job failures.
  4. Overall, collaborate with researchers to gather requirements, understand tooling, visualization, and automation needs, and deliver full-stack solutions that move the needle at the speed of light.

Skills

Required

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience
  • 12+ years of full-time industry experience in large-scale MLOps and AI infrastructure.
  • Strong experience in full-stack software development, with a focus on building CI/CD or visualization tools.
  • Proficient in both front-end and back-end programming with Python, JavaScript, SQL, or similar.
  • Familiar with modern web front-end and back-end technologies such as React and Node.js.
  • Knowledge of GPU technologies like CUDA and NCCL

Nice to have

  • Master’s or PhD degree in Computer Science, Robotics, Engineering, or a related field
  • Demonstrated Tech Lead experience, coordinating a team of engineers and driving projects from conception to deployment
  • Strong experience building and operating large-scale tooling infrastructure in production
  • Strong background and curiosity in frontier AI research
  • PyTorch
  • Ray
  • Kubernetes

What the JD emphasized

  • 12+ years of full-time industry experience in large-scale MLOps and AI infrastructure.
  • Strong experience in full-stack software development, with a focus on building CI/CD or visualization tools.
  • Knowledge of GPU technologies like CUDA and NCCL
  • Strong experience building and operating large-scale tooling infrastructure in production

Other signals

  • ML productivity tooling
  • humanoid robots
  • foundation models
  • robot learning
  • embodied AI
  • physics simulation
  • AI agents
  • large-scale robot learning