System Software Engineer - Deep Learning

NVIDIA · Semiconductors · Bangalore, India

System Software Engineer at NVIDIA focused on accelerating deep learning inference for autonomous driving systems using NVIDIA GPUs and DL accelerators. The role involves developing SDKs/frameworks for LLMs and state-of-the-art models, benchmarking, and optimizing for latency, accuracy, and power consumption. Requires experience with deep learning frameworks, DNN optimization, and C/C++.

What you'd actually do

  1. Develop solutions around NVIDIA GPUs and Deep Learning Accelerators (DLA) to accelerate inference for ADAS systems
  2. Develop SDKs/frameworks to accelerate LLMs and state-of-the-art models for the NVIDIA DRIVE platform
  3. Conduct benchmarking and evaluation to continuously improve the models' inference latency, accuracy, and power consumption
  4. Stay up to date with the latest research and innovations in deep learning, implement and experiment with new ideas to improve NVIDIA's automotive DNNs
  5. Own the technical relationship with automotive customers and assist them in building creative solutions based on NVIDIA technology

Skills

Required

  • BS or MS degree in Computer Science, Computer Engineering, or Electrical Engineering
  • 5+ years of experience in developing or using deep learning frameworks (e.g., TensorFlow, Keras, PyTorch, Caffe, ONNX)
  • Proven experience in optimizing DNN layers for GPUs or other DSPs
  • Understanding of compiler infrastructure such as LLVM and MLIR, and the associated optimization flow for DL accelerators
  • Proficiency in C/C++ and data structures
  • Strong OS fundamentals and knowledge of CPU/GPU architecture
  • Familiarity with state-of-the-art CNN/LSTM/Transformer architectures

What the JD emphasized

  • 5+ years of experience in developing or using deep learning frameworks
  • Proven experience in optimizing DNN layers for GPUs or other DSPs

Other signals

  • NVIDIA DRIVE AI platform
  • accelerate inference for ADAS systems
  • accelerate LLMs and state-of-the-art models
  • improve inference latency, accuracy and power consumption
  • optimizing DNN layers for GPUs or other DSPs