Member of Technical Staff - AI Pretraining - MAI Superintelligence Team

Microsoft · Big Tech · London, United Kingdom +2 · Software Engineering

Microsoft AI is seeking individuals to train the world's most capable AI frontier models, focusing on scale, performance, and product deployment. The role involves developing algorithms, model architectures, and data mixtures for large-scale training, driving implementations, conducting experiments, and collaborating with infrastructure, data, post-training, and multimodality teams.

What you'd actually do

  1. Develop algorithms, model architectures, data mixtures, and scaling laws for large-scale training using a rigorous data-driven approach grounded in meticulous ablations
  2. Drive algorithmic implementations, conduct experiments, and oversee flagship training runs on our in-house large-scale distributed stack
  3. Collaborate closely with teams on infrastructure, data, post-training, and multimodality

Skills

Required

  • Bachelor's Degree in Computer Science or a related technical discipline
  • Technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Proven expertise in the area of pretraining

Nice to have

  • Demonstrated experience in large-scale AI
  • Passion for conversational AI and its deployment
  • Demonstrated written and verbal communication skills
  • Ability to work closely with cross-functional teams, including product managers, designers, and other engineers
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in AI
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team

What the JD emphasized

  • proven expertise in the area of pretraining
  • exceptional publication track record
  • significant technical leadership in high-impact projects

Other signals

  • train the world’s most capable AI frontier models
  • deliver one of the best foundation models in the world
  • large-scale distributed stack