AI System Research and Development Engineer - Optimization

Snowflake · Data AI · WA-Bellevue, United States · Engineering

The role focuses on optimizing LLM inference and training systems, including GPU kernel performance, efficiency, and scalability. It also involves contributing to agentic frameworks and applications. The company has recently released an inference optimization (SwiftKV) and a large MoE foundation model (Arctic LLM).

What you'd actually do

  1. Analyze and optimize GPU kernel performance for training and inference of LLMs.
  2. Develop and implement strategies to enhance the efficiency and scalability of deep learning systems.
  3. Profile and benchmark deep learning systems with tools such as nvprof and Nsight to identify bottlenecks.
  4. Design and implement optimizations to reduce latency and improve resource utilization for training and inference.
  5. Contribute to the development of agentic frameworks and applications for LLM-driven workflows, enhancing automation, reasoning, and decision-making capabilities.

Skills

Required

  • GPU kernel optimization
  • deep learning system optimization
  • high-performance computing (HPC)
  • PyTorch
  • TensorFlow
  • JAX
  • GPU architectures
  • CUDA
  • CUTLASS
  • Triton
  • cuDNN
  • nvprof
  • Nsight

Nice to have

  • Master’s degree or PhD
  • Published innovations
  • Technical blogs
  • Publications in top-tier conferences and journals

What the JD emphasized

  • LLM inference and training system development
  • optimizations
  • agentic systems
  • GPU kernel optimization
  • deep learning system optimization
  • high-performance computing (HPC)

Other signals

  • efficient and scalable generative AI systems