Software Engineer, LLM Inference

NVIDIA · Semiconductors · Shanghai, China

Software Engineer focused on developing and optimizing LLM inference software and frameworks, working with GPU-accelerated libraries and deep learning frameworks.

What you'd actually do

  1. Design and develop robust inference software that scales across multiple platforms for both functionality and performance
  2. Analyze, optimize, and tune inference performance
  3. Closely follow academic developments in artificial intelligence and keep features up to date
  4. Collaborate across the company to guide the direction of machine learning inference, working with software, research, and product teams

Skills

Required

  • Master's or higher degree in Computer Engineering, Computer Science, Applied Mathematics, or a related computing-focused field (or equivalent experience)
  • 3+ years of relevant software development experience
  • Excellent C/C++ programming and software design skills, including debugging, performance analysis, and test design
  • Strong curiosity about artificial intelligence and awareness of the latest developments in deep learning, such as LLMs and generative models
  • Experience working with deep learning frameworks like PyTorch
  • Proactive and able to work with minimal supervision
  • Excellent written and oral communication skills in English
  • Strong customer communication skills, with a strong motivation to provide highly responsive support as needed

What the JD emphasized

  • LLM inference framework developer engineer
  • performance analysis, optimization and tuning
  • deep learning frameworks like PyTorch

Other signals

  • GPU-accelerated libraries