Senior Data Scientist

Microsoft · Big Tech · Redmond, WA +4 · Data Science

Senior Data Scientist role focused on developing and implementing methodologies to evaluate LLM performance for Copilot, including training classifiers, experimenting with data collection, and providing real-time performance signals. The role involves creating automated evaluation frameworks and working closely with user researchers and product leaders.

What you'd actually do

  1. Measure the performance of Copilot and identify failure modes and novel mitigation strategies, using techniques such as data mining, prompt engineering, LLM-as-a-judge, and classifier training.
  2. Create and implement comprehensive evaluation frameworks across diverse scenarios, edge cases, and potential failure modes.
  3. Build automated testing systems, generalize solutions into repeatable frameworks, and write efficient code for model pipelines and intervention systems.
  4. Track advances in research, identify relevant state-of-the-art techniques, and adapt algorithms to drive innovation in production systems serving millions of users.
  5. Maintain a user-oriented perspective by understanding user needs, validating approaches through user research, and serving as a trusted advisor on AI matters.

Skills

Required

  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 1+ year data-science experience, OR
  • Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 3+ years data-science experience, OR
  • Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 5+ years data-science experience, OR equivalent experience.
  • Experience prompting and working with large language models.
  • Experience writing production-quality Python code.

Nice to have

  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 3+ years data-science experience, OR
  • Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 5+ years data-science experience, OR
  • Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 7+ years data-science experience, OR equivalent experience.
  • Demonstrated interest in Responsible AI.

What the JD emphasized

  • evaluate how well Copilot performs
  • automated evaluation frameworks
  • measure the performance of Copilot
  • comprehensive evaluation frameworks
  • automated testing systems

Other signals

  • evaluating LLMs
  • building automated evaluation frameworks
  • real-time signals on Copilot performance