Member of Technical Staff - Mid-training

xAI · AI Frontier · Palo Alto, CA · Model

This role focuses on scaling synthetic data for AI training, distilling model intelligence, optimizing data mixtures for RL, engineering long-context data, and developing evaluations for mid-training checkpoints. It requires expertise in ML scaling laws, experimental design, curating multi-modal AI training data, and large-scale data processing frameworks.

What you'd actually do

  1. Scale synthetic coding data to trillions of tokens with large-scale Docker verification.
  2. Distill the intelligence of flagship models into flash models through synthetic data generation.
  3. Optimize mid-training data mixtures to boost the ceiling for RL.
  4. Engineer long-context data recipes.
  5. Develop robust and diverse evaluations for mid-training checkpoints.

Skills

Required

  • Expertise in ML and large model scaling
  • Familiarity with a wide range of scaling laws
  • Strong ability to design ML experiments
  • Familiarity with state-of-the-art techniques for curating AI training data for text, image, audio, and video modalities
  • Strong engineering abilities in Spark, Ray, and other frameworks for large-scale data processing

What the JD emphasized

  • trillions of tokens
  • large-scale Docker verification
  • synthetic data generation
  • mid-training data mixtures
  • long-context data recipes
  • robust and diverse evaluation
