Senior Research Scientist - Machine Learning System

ByteDance · Big Tech · San Jose, CA · R&D

Develop and optimize large-scale distributed ML training and inference systems, with a focus on LLM inference frameworks and GPU/CUDA performance tuning for a high-performance LLM inference engine.

What you'd actually do

  1. Develop and optimize the LLM inference framework.
  2. Drive GPU and CUDA performance optimization to build an industry-leading, high-performance LLM inference engine.

Skills

Required

  • C/C++
  • algorithms and data structures
  • Python
  • deep learning algorithms
  • neural networks
  • PyTorch

Nice to have

  • GPU high-performance computing optimization
  • CUDA
  • computer architecture
  • parallel computing optimization
  • memory access optimization
  • low-bit computing
  • TensorRT-LLM
  • ORCA
  • vLLM
  • LLM models
  • LLM model acceleration and optimization

What the JD emphasized

  • LLM inference framework
  • GPU and CUDA performance optimization
  • high-performance LLM inference engine
