Member of Technical Staff - Copilot AI Evaluation Engineering Manager

Microsoft · Big Tech · Mountain View, CA +2 · Software Engineering

Lead a team of engineers to build and manage LLM evaluation solutions for Microsoft Copilot, focusing on quality, reliability, and scalability. This role involves designing evaluation platforms and techniques to measure and improve the performance of AI companions.

What you'd actually do

  1. Hire, manage, and lead a team of software engineers, AI engineers, and machine learning engineers responsible for delivering world-class LLM evaluation solutions.
  2. Collaborate with Eng and Product leadership to prioritize features and improve our world-class AI companion for users worldwide.
  3. Design and build evaluation platforms, novel evaluation techniques, and agentic solutions for measuring and improving Copilot quality.
  4. Drive implementation of features and systems, breaking down long-term goals into clear milestones, aligning with release plans, and ensuring cross-team coordination.

Skills

Required

  • Bachelor's Degree in Computer Science or a related technical discipline
  • 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
  • People management experience

Nice to have

  • Master's Degree in Computer Science or related technical field
  • 8+ years technical engineering experience
  • Hands-on experience with ML and/or AI evaluation
  • Experience leading engineering teams to deliver large-scale software systems, preferably in AI, machine learning, graphics or related fields.

What the JD emphasized

  • LLM evaluation solutions
  • measuring and improving Copilot quality
  • evaluation platforms
  • novel evaluation techniques
  • agentic solutions

Other signals

  • design and build evaluation platforms