Currently tracking 427 active AI roles, up 208% on the prior four weeks. Primary focus: Agent · Engineering. Salary range: $65k–$331k (average $193k).
| Title | Stage | AI score |
|---|---|---|
| **Member of Technical Staff, Evaluations Engineering - MAI Superintelligence Team.** This role focuses on building and scaling the evaluation infrastructure for generative AI models on large-scale GPU clusters. It involves developing sophisticated tools and techniques for reliability, performance, and health monitoring, and collaborating with model scientists on evaluation methods and inference strategies. The role also touches on pretraining software development and benchmarking. | Eval Gate · Serve | 9 |
| **Senior Software Engineer - Responsible AI (CoreAI).** Senior Software Engineer focused on building Responsible AI services, including identifying, measuring, mitigating, and monitoring AI risks across various content types. The role involves designing and developing large-scale distributed cloud services with a focus on safety, governance, inference, evaluation, and multimodal safety infrastructure. | Eval Gate · Agent | 8 |
| **Senior Security Researcher.** Senior Security Researcher role focused on threat hunting within Microsoft Defender Experts. The role involves exploring large datasets to detect advanced attack techniques, generating custom alerts, collaborating with data science and threat research teams, and building hunting tools and automations. Requires a strong background in cybersecurity, data analysis, and potentially machine learning, with a focus on enterprise security and threat intelligence. | Eval Gate | 7 |
| **Senior Software Engineer.** Senior Software Engineer role focused on building AI-powered operational excellence for Azure Reliability. The role involves developing evaluation loops, generalizing ML solutions into frameworks, operationalizing prompted classifiers at scale, and ensuring responsible AI practices. | Eval Gate · Agent | 7 |
| **Member of Technical Staff, Principal Engineering Manager.** Seeking an experienced engineering leader to build, scale, and run a high-performing engineering organization responsible for Copilot AI Evaluation. This role involves setting technical and organizational strategy for LLM evaluation, partnering with senior leadership, and owning the delivery of evaluation platforms and novel techniques to measure and improve Copilot quality at scale. | Eval Gate · Agent | 7 |
| **Member of Technical Staff - Copilot AI Evaluation Engineering Manager.** Lead a team of engineers to build and manage LLM evaluation solutions for Microsoft Copilot, focusing on quality, reliability, and scalability. This role involves designing evaluation platforms and techniques to measure and improve the performance of AI companions. | Eval Gate | 7 |
| **Senior Software Engineer - CoreAI.** Senior Software Engineer to join the Evaluation platform team within Core AI, focusing on building core services for large-scale agent observability and optimizing AI agent performance. | Eval Gate · Agent | 7 |
| **Principal Applied Scientist, Experimentation Platform - CoreAI.** The Principal Applied Scientist will work on Microsoft's Experimentation Platform (ExP) within CoreAI, focusing on enabling high-scale online experimentation for AI-driven applications. This role involves advancing experimentation methodology and agent evaluations, collaborating with various engineering and science teams, and translating applied research into production features for a large-scale platform. The goal is to accelerate product learning and drive progress across Microsoft's AI ecosystem by providing robust experimentation capabilities. | Eval Gate · Agent | 7 |
| **Member of Technical Staff - Full Stack Software Engineer.** Full Stack Software Engineer to build capabilities for Microsoft's personalized AI assistant, Copilot. The role involves working across the evaluation platform, including data sampling, collection, processing, analysis, and insight generation. Responsibilities include full-stack development, prompt engineering, leveraging AI tools, and collaborating with teams to assess Copilot's performance, trustworthiness, and visual appeal across various platforms and scenarios like multi-turn conversations with voice input. | Eval Gate | 5 |
| **Software Engineer II - CoreAI.** This role focuses on building core services for an AI evaluation platform, specifically for agent observability within Microsoft's CoreAI group. The engineer will design, implement, and deliver AI services to support product offerings for large-scale agent observability, collaborating with product management and partner teams, and taking end-to-end responsibility for the development lifecycle and production readiness. | Eval Gate · Agent | 5 |
| **Principal Software Engineer.** The Principal Software Engineer will join the M365 Evaluation Platform Team to enhance the evaluation system for AI offerings, supporting millions of users. The role involves building capabilities to enable agile and faster evaluations, providing continuous tools throughout the development lifecycle, and automating tasks via tools or agents to improve performance understanding. The focus is on building reliable, scalable infrastructure and driving quality in products using data, with a platform engineering mindset. | Eval Gate | 5 |