Research Engineer/Scientist (All Levels), Efficient Models

ByteDance · Big Tech · San Jose, CA · R&D

Research Engineer/Scientist focused on developing efficient algorithms and architectures for large-scale generative and multimodal models, with an emphasis on model distillation, compression, and hardware-efficient inference for applications like image generation, video generation, and VLMs.

What you'd actually do

  1. Develop efficient algorithms and architectures for large-scale generative and multimodal models (e.g., image generation, video generation, VLMs), using techniques such as step distillation, CFG distillation, and quantization to improve model efficiency.
  2. Advance scalable generative modeling approaches, including diffusion and autoregressive models, with a focus on acceleration and efficiency.
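As a concrete illustration of one technique named above, here is a minimal, framework-free sketch of symmetric per-tensor int8 weight quantization. All names are illustrative, not ByteDance tooling; production work would use per-channel scales and a real framework's quantization stack.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: one scale for the whole tensor.

    Illustrative sketch only; real pipelines use per-channel scales,
    calibration data, and framework kernels.
    """
    # Scale maps the largest-magnitude weight to the int8 extreme 127.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]


# Round-trip example: error per weight is bounded by half the scale.
w = [0.5, -1.27, 0.03]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

The round-trip error is at most `scale / 2` per weight, which is why activation ranges (and hence scales) dominate quantization quality in practice.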

Skills

Required

  • B.S. in Computer Science or a related field, or equivalent experience
  • Expertise in efficient models, with a deep understanding of computational bottlenecks and acceleration methods.
  • Proficiency in training generative AI models or LLMs using widely adopted frameworks and tools such as PyTorch and JAX.
  • Strong communication and collaboration skills in fast-paced environments.

Nice to have

  • Ph.D. in GenAI or MLSys, or equivalent experience
  • Extensive research experience across GenAI, MLSys, and LLM areas.
  • Proven experience in at least one of the following areas: image/video generation and editing; model compression (e.g., quantization, step/CFG distillation); efficient architectures (e.g., MoE, window attention); efficient model design; or reinforcement-learning training methods (e.g., RLHF, DPO, GRPO).
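To make the CFG-distillation item concrete: classifier-free guidance normally requires two model evaluations per denoising step (conditional and unconditional), and CFG distillation trains a student to emit the guided prediction in a single pass. A minimal sketch of the guidance combination being distilled away; function and variable names are illustrative, not a specific model's API.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: blend unconditional and conditional
    noise predictions, extrapolating toward the conditional one.

    Costs two forward passes per step; CFG distillation trains a
    student to predict this combined output directly in one pass.
    """
    return [u + guidance_scale * (c - u)
            for u, c in zip(eps_uncond, eps_cond)]
```

With `guidance_scale = 1.0` this reduces to the plain conditional prediction; typical scales above 1 trade diversity for prompt adherence, and distillation bakes that trade-off into the student's weights.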

What the JD emphasized

  • distillation
  • compression
  • model efficiency
  • scalable training
  • optimization
  • deployment
  • model acceleration
  • hardware-efficient inference
  • image generation
  • video generation
  • VLM
  • diffusion
  • autoregressive models
  • quantization
  • step distillation
  • CFG distillation
  • RLHF
  • DPO
  • GRPO

Other signals

  • Developing methods and infrastructure for transferring capabilities from foundation models into smaller, more efficient models
  • scalable training, optimization, and deployment
  • distillation frameworks, model acceleration, hardware-efficient inference
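The first signal — transferring capabilities from a foundation model into a smaller, more efficient one — is classic knowledge distillation. A minimal sketch of the temperature-softened cross-entropy loss commonly used for it, in pure Python; names and the temperature value are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits, softened by a temperature > 1."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened distribution against
    the teacher's softened distribution.

    Minimized when the student's distribution matches the teacher's;
    the temperature exposes the teacher's 'dark knowledge' in the
    relative weights of non-argmax classes.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

In a training loop this term is typically mixed with the ordinary hard-label loss; the mixing weight and temperature are tuned per task.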