Sr. Multimodal Model Training and Inference Optimization Engineer

ByteDance · Big Tech · San Jose, CA · R&D

Seeking an experienced engineer to optimize large-scale multimodal generative AI model training and inference pipelines, focusing on distributed training strategies and performance bottlenecks for consumer-facing applications like TikTok.

What you'd actually do

  1. Optimize large model training pipelines to improve efficiency, speed, and scalability.
  2. Develop and improve distributed training strategies, such as data parallelism, model parallelism, pipeline parallelism, and communication optimization, to accelerate model training.
  3. Benchmark and profile deep learning models to identify performance bottlenecks and optimize computational resources.
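To make the first two responsibilities concrete: synchronous data parallelism boils down to each worker computing gradients on its own data shard, then averaging them across workers (an all-reduce) before the optimizer step. The sketch below simulates that averaging step in pure Python; it is illustrative only, with made-up toy gradients and worker counts. In practice this role would rely on torch.distributed, DeepSpeed, or Megatron for the real collective operations.

```python
def allreduce_mean(per_worker_grads):
    """Average gradients elementwise across workers (a toy all-reduce).

    per_worker_grads: list of gradient vectors, one per worker,
    all of the same length. Returns the elementwise mean, which is
    what each worker would hold after a real all-reduce + divide.
    """
    num_workers = len(per_worker_grads)
    num_params = len(per_worker_grads[0])
    return [
        sum(grads[i] for grads in per_worker_grads) / num_workers
        for i in range(num_params)
    ]

# Toy example: two workers, each with gradients from its own shard.
grads = [
    [0.25, -0.5, 1.0],  # worker 0
    [0.75,  0.5, 0.0],  # worker 1
]
print(allreduce_mean(grads))  # [0.5, 0.0, 0.5]
```

In a real pipeline this step is fused and overlapped with backpropagation (e.g. gradient bucketing in PyTorch DDP), which is exactly the kind of communication/computation overlap this role would tune.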

Skills

Required

  • Python
  • C++
  • CUDA
  • PyTorch
  • Megatron
  • DeepSpeed
  • distributed training
  • transformers
  • diffusion models

Nice to have

  • M.S. or Ph.D. in Computer Science, Electrical Engineering, Artificial Intelligence, or a related field
  • publications at conferences such as MLSys, NeurIPS, ICLR, or ICML
  • experience implementing and optimizing complex, performance-critical systems

What the JD emphasized

  • 3+ years of experience in AI model training optimization

Other signals

  • optimizing AI model training and inference
  • large-scale generative AI models
  • distributed training/inference and acceleration