Currently tracking 427 active AI roles, up 208% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $65k–$331k (avg $193k).
| Title | Stage | AI score |
|---|---|---|
| **Member of Technical Staff, Evaluations Engineering - MAI Superintelligence Team.** This role focuses on building and scaling the evaluation infrastructure for generative AI models on large-scale GPU clusters. It involves developing sophisticated tools and techniques for reliability, performance, and health monitoring, and collaborating with model scientists on evaluation methods and inference strategies. The role also touches on pretraining software development and benchmarking. | Eval Gate · Serve | 9 |
| **Senior Software Engineer - Responsible AI (CoreAI).** Senior Software Engineer focused on building Responsible AI services, including identifying, measuring, mitigating, and monitoring AI risks across various content types. The role involves designing and developing large-scale distributed cloud services with a focus on safety, governance, inference, evaluation, and multimodal safety infrastructure. | Eval Gate · Agent | 8 |
| **Senior Researcher - AI & Society - Microsoft Research.** Senior Researcher at Microsoft Research focusing on the intersection of AI systems and society, with an emphasis on sociotechnical approaches to AI evaluation, responsible AI in industry, and AI safety. The role involves interdisciplinary research, collaboration with industry teams, and a strong publication record. | Eval Gate | 8 |
| **Research Intern - AI Evaluation and Alignment.** Research Intern role focused on advancing the quality, reliability, and evaluation of LLM-based systems by exploring new ML methods for AI assessment and alignment. Responsibilities include co-developing research projects, implementing ML approaches (training/fine-tuning), and developing evaluation frameworks. Requires PhD enrollment in a technical field and hands-on LLM experience. | Eval Gate · Post-train | 8 |
| **Research Intern - STAC, NYC (Sociotechnical Alignment Center).** Research Intern position at Microsoft's Sociotechnical Alignment Center (STAC) focusing on evaluating AI systems, particularly generative ones. The role involves applying measurement theory from the social sciences and statistics to assess risks, capabilities, and performance. Collaboration with Fairness, Accountability, Transparency, and Ethics in AI (FATE) researchers is expected. The internship emphasizes theoretical and methodological approaches to advancing AI system evaluation. | Eval Gate | 8 |
| **Senior Security Researcher.** Senior Security Researcher role focused on threat hunting within Microsoft Defender Experts. The role involves exploring large datasets to detect advanced attack techniques, generating custom alerts, collaborating with data science and threat research teams, and building hunting tools and automations. Requires a strong background in cybersecurity, data analysis, and potentially machine learning, with a focus on enterprise security and threat intelligence. | Eval Gate | 7 |
| **Senior Software Engineer.** Senior Software Engineer role focused on building AI-powered operational excellence for Azure Reliability. The role involves developing evaluation loops, generalizing ML solutions into frameworks, operationalizing prompted classifiers at scale, and ensuring responsible AI practices. | Eval Gate · Agent | 7 |
| **Research Intern - Inference Economics and Human Agency.** Research Intern conducting empirical research on how human oversight shapes the economic return of AI-assisted work. This includes designing controlled experiments, developing session-level evaluation frameworks that link inference cost to output quality and human effort, and analyzing how interface design choices affect user confidence, reliance, and decision quality during AI-assisted tasks. The role involves collaboration with the MADE team and preparing a submission-ready research manuscript. | Eval Gate · Agent | 7 |
| **Member of Technical Staff, Principal Engineering Manager.** Seeking an experienced engineering leader to build, scale, and run a high-performing engineering organization responsible for Copilot AI Evaluation. This role involves setting technical and organizational strategy for LLM evaluation, partnering with senior leadership, and owning the delivery of evaluation platforms and novel techniques to measure and improve Copilot quality at scale. | Eval Gate · Agent | 7 |
| **Member of Technical Staff - Copilot AI Evaluation Engineering Manager.** Lead a team of engineers to build and manage LLM evaluation solutions for Microsoft Copilot, focusing on quality, reliability, and scalability. This role involves designing evaluation platforms and techniques to measure and improve the performance of AI companions. | Eval Gate | 7 |
| **Senior Software Engineer - CoreAI.** Senior Software Engineer to join the Evaluation platform team within Core AI, focusing on building core services for large-scale agent observability and optimizing AI agent performance. | Eval Gate · Agent | 7 |
| **Principal Applied Scientist, Experimentation Platform - CoreAI.** The Principal Applied Scientist will work on Microsoft's Experimentation Platform (ExP) within CoreAI, enabling high-scale online experimentation for AI-driven applications. The role involves advancing experimentation methodology and agent evaluations, collaborating with engineering and science teams, and translating applied research into production features for a large-scale platform, with the goal of accelerating product learning across Microsoft's AI ecosystem. | Eval Gate · Agent | 7 |
| **Principal Researcher - AI & Society - Microsoft Research.** Principal Researcher at Microsoft Research focusing on the intersection of AI and society, with an emphasis on sociotechnical approaches to AI evaluation, measurement, and responsible AI in industry. The role involves interdisciplinary research, collaboration with engineering and policy teams, and a strong publication record. | Eval Gate | 7 |
| **Postdoctoral Researcher - FATE (Fairness, Accountability, Transparency, and Ethics in AI) - Microsoft Research.** Postdoctoral Researcher position at Microsoft Research NYC focusing on Fairness, Accountability, Transparency, and Ethics in AI (FATE). The role involves pursuing an independent research agenda, collaborating with researchers, and contributing to ongoing projects on the social implications of machine learning and AI. Research areas include AI evaluation, responsible AI in industry, AI law and policy, transparency, human-AI interaction, and various social impacts of AI. | Eval Gate | 7 |