Sr. Multimodal Model Training and Inference Optimization Engineer

ByteDance · Big Tech · Seattle, WA · R&D

We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer to optimize training and inference for large-scale generative AI models, including distributed training/inference and acceleration. Responsibilities include optimizing training pipelines, developing distributed training strategies, and benchmarking/profiling models.

What you'd actually do

  1. Optimize large model training pipelines to improve efficiency, speed, and scalability.
  2. Develop and improve distributed training strategies, such as data parallelism, model parallelism, pipeline parallelism, and communication optimization, to accelerate model training.
  3. Benchmark and profile deep learning models to identify performance bottlenecks and optimize computational resources.
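
To illustrate the trade-off behind the pipeline-parallelism responsibility above, here is a minimal, self-contained sketch (not from the JD; it assumes a GPipe-style schedule) of the pipeline "bubble": with p stages and m microbatches, each stage is idle for a fraction (p - 1) / (m + p - 1) of the schedule, which is why increasing the microbatch count is a standard lever for pipeline efficiency.

```python
def pipeline_bubble_fraction(stages: int, microbatches: int) -> float:
    """Idle ("bubble") fraction of a GPipe-style pipeline schedule.

    With p stages and m microbatches, each stage is busy for m of the
    m + p - 1 total steps, so the idle fraction is (p - 1) / (m + p - 1).
    """
    if stages < 1 or microbatches < 1:
        raise ValueError("stages and microbatches must be >= 1")
    return (stages - 1) / (microbatches + stages - 1)

# More microbatches shrink the bubble: with 4 stages, 4 microbatches
# leave the pipeline idle ~43% of the time, 32 microbatches only ~9%.
print(pipeline_bubble_fraction(4, 4))   # 0.42857...
print(pipeline_bubble_fraction(4, 32))  # 0.08571...
```

Interleaved schedules such as 1F1B reduce activation memory but have the same bubble term, which is why frameworks like Megatron combine pipeline parallelism with tensor and data parallelism rather than relying on deep pipelines alone.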

Skills

Required

  • Python
  • C++
  • CUDA
  • PyTorch
  • Megatron-LM
  • DeepSpeed
  • distributed training
  • transformers
  • diffusion models

Nice to have

  • publications in MLSys, NeurIPS, ICLR, or ICML
  • experience implementing and optimizing complex, performance-critical systems

What the JD emphasized

  • 3+ years of experience in AI model training optimization
  • Strong software engineering skills
  • Strong proficiency in deep learning frameworks
  • Experience with distributed training techniques
  • Knowledge of transformers and diffusion models

Other signals

  • optimizing AI model training and inference
  • distributed training/inference and acceleration
  • large-scale generative AI models
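
The benchmarking/profiling emphasis above can be sketched with a small stdlib-only timing harness (a hypothetical helper, not from the JD; real model profiling would use torch.profiler or Nsight and add device synchronization around each timed call): warm up first, collect many samples, and report the median rather than the mean to resist OS scheduler noise.

```python
import statistics
import time


def benchmark(fn, *, warmup: int = 3, repeats: int = 20) -> dict:
    """Time fn() after warmup iterations and return robust statistics.

    The median resists outliers from OS jitter; the minimum approximates
    best-case latency. GPU kernels would additionally need a sync
    (e.g. torch.cuda.synchronize) before reading each timestamp.
    """
    for _ in range(warmup):  # warm caches, JIT, allocator
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "repeats": repeats,
    }


stats = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"median {stats['median_s'] * 1e3:.3f} ms over {stats['repeats']} runs")
```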