Research Intern - STAC, NYC (Sociotechnical Alignment Center)

Microsoft · Big Tech · New York, NY +1 · Applied Sciences

Research Intern position at Microsoft's Sociotechnical Alignment Center (STAC) focusing on evaluating AI systems, particularly generative ones. The role involves applying measurement theory from social sciences and statistics to assess risks, capabilities, and performance. Collaboration with Fairness, Accountability, Transparency, and Ethics in AI (FATE) researchers is expected. The internship emphasizes theoretical and methodological approaches to advance AI system evaluation.

What you'd actually do

  1. Conduct research on measuring risks, capabilities, performance, and other properties of AI systems, with a focus on “generative” or “general purpose” systems
  2. Collaborate with, and/or receive co-mentorship from, one or more [FATE researchers](https://www.microsoft.com/en-us/research/theme/fate/people/)
  3. Collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community

Skills

Required

  • Currently enrolled in a relevant PhD program

Nice to have

  • Demonstrated ability to conduct original research
  • Able to collaborate effectively with other researchers and product development teams
  • Interpersonal skills, cross-group, and cross-culture collaboration
  • Ability to think unconventionally to derive creative and innovative solutions
  • Expertise in theoretical and methodological approaches
  • Prior experience with AI system evaluation
  • Background in technical fields (e.g., machine learning, artificial intelligence, natural language processing, computer vision, statistics)
  • Background in sociotechnical fields (e.g., linguistics, economics, human-computer interaction, information science, educational testing, psychometrics)
  • Measurement theory from the social sciences
  • Reliability of generative AI systems and generative AI system evaluation
  • Validity of generative AI system evaluation
  • Automated methods for constructing conceptual definitions for evaluation
  • Improving human and automated annotation methods for use in evaluation, including LLM-as-a-judge
  • Generative AI user simulations and synthetic data generation
  • Linguistic models of conversational organization
  • Applications of methodologies from psychometrics and educational testing to the design and validation of evaluations

What the JD emphasized

  • Prior experience specifically with AI system evaluation is preferred
  • Theoretical and methodological approaches that can be applied to help advance and mature the field of AI system evaluation

Other signals

  • AI system evaluation
  • Measuring risks and capabilities
  • Sociotechnical alignment