Senior Research Scientist, Fundamental LLM Research for Knowledge, Reasoning, and Agents

NVIDIA · Semiconductors · Santa Clara, CA

NVIDIA is seeking a Senior Research Scientist to conduct fundamental LLM research, focusing on post-training, alignment, synthetic data, reasoning, novel learning paradigms, and multi-modalities. The role involves exploring new capabilities, enabling agency, acquiring commonsense knowledge, publishing research, and collaborating with product groups.

What you'd actually do

  1. Explore alternative avenues to unlock new capabilities in language models, including advanced knowledge acquisition techniques and innovative learning and decoding algorithms.
  2. Innovate new learning paradigms that incorporate agency into the training of language models, such as enabling self-reflection and targeted knowledge enhancement.
  3. Enable learning from multi-modalities beyond written text, such as acquiring physical commonsense knowledge through interactions with real-world environments.
  4. Publish original research.
  5. Collaborate with fellow team members and with other teams across the company.

Skills

Required

  • PhD in Computer Science or Computer Engineering (or equivalent experience)
  • At least 6 years of research experience
  • Excellent knowledge of theory and practice of deep learning and natural language processing
  • Excellent programming skills in Python and PyTorch
  • Excellent communication skills

Nice to have

  • Mentoring interns
  • Speaking at conferences and events
  • Working with product groups to transfer technology
  • Collaborating with external researchers
  • Computer vision expertise

What the JD emphasized

  • strong publication record spanning 5+ years
  • Background in LLM training, alignment, and evaluation (expected)
  • Hands-on experience with large-scale model training, including data preparation and model parallelization (tensor and pipeline) (required)

Other signals

  • LLM research
  • post-training and alignment
  • synthetic data generation
  • reasoning and inference algorithms
  • novel learning paradigms
  • LLM architectures
  • fundamental limits and capabilities of LLMs
  • multi-modalities
  • incorporating agency into language model training
  • physical commonsense knowledge