Member of Technical Staff - Pre-Training - MAI Superintelligence Team

Microsoft · Big Tech · Mountain View, CA +4 · Software Engineering

This role focuses on training frontier AI foundation models at Microsoft AI, within the Pre-Training team of the Superintelligence Team. Responsibilities include developing algorithms, model architectures, data mixtures, and scaling laws for large-scale training; driving implementations; conducting experiments; and overseeing training runs. The role emphasizes collaboration with the infrastructure, data, post-training, and multimodality teams.

What you'd actually do

  1. Develop algorithms, model architectures, data mixtures, and scaling laws for large-scale training using a rigorous data-driven approach grounded in meticulous ablations
  2. Drive algorithmic implementations, conduct experiments, and oversee flagship training runs on our in-house large-scale distributed stack
  3. Collaborate closely with teams on infrastructure, data, post-training, and multimodality

Skills

Required

  • Bachelor's Degree in Computer Science, Machine Learning, Mathematics, or related technical discipline
  • 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python

Nice to have

  • Master's Degree in Computer Science or related technical field
  • 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • Demonstrated experience in large-scale AI
  • Passion for conversational AI and its deployment
  • Demonstrated written and verbal communication skills with the ability to work closely with cross-functional teams, including product managers, designers, and other engineers
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies in AI
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team

What the JD emphasized

  • exceptional publication track record
  • significant technical leadership in high-impact projects
  • large-scale distributed systems

Other signals

  • train the world's most capable AI frontier models
  • pushing the boundaries of scale, performance and product deployment
  • deliver one of the best foundation models in the world
  • next generation of systems that will transform the field
  • push the boundaries of AI toward Humanist Superintelligence