Microsoft has 521 active AI-related job listings. The majority of these roles are focused on agents, representing 37% of the total, followed by application and serving infrastructure. Engineering is the most frequent function, with a significant number of openings, and the United States is the primary hiring country. Frequent tech tags include agent orchestration, model serving, and LLM observability, suggesting a focus on operationalizing AI models. Over the last 30 days, Microsoft has added 280 new AI roles, a 157% increase compared to the previous 30-day period.
Currently tracking 250 active AI roles, down 24% versus the prior 4 weeks. Primary focus: Agent · Engineering. Salary range $65k–$331k (avg $195k).
Microsoft currently has 343 active AI-related roles in our index. The most common open titles are: Principal Software Engineer (19), Senior Software Engineer (19), Software Engineer II (8), Principal Applied Scientist (7), Principal Data Scientist (4). Most positions are in Engineering and Research.
Microsoft's active AI hiring is concentrated in: agents (36%), application (21%), serving infrastructure (19%). These categories follow a seven-stage AI lifecycle: data, pre-training, post-training, serving infrastructure, agents, evaluation, and application.
Microsoft is hiring AI talent in: United States (308 roles), Canada (15 roles), Japan (8 roles), United Kingdom (7 roles).
Job postings at Microsoft most frequently mention: Computer Architecture, Python, Machine Learning, C#, C++.
In the past 30 days, Microsoft has posted 227 new AI-related roles.
| Title and summary | Stage | AI score |
|---|---|---|
| Member of Technical Staff, Capacity & Efficiency Infrastructure - MAI Superintelligence Team This role focuses on optimizing and managing the compute infrastructure for training large-scale AI models. The responsibilities include designing and implementing distributed training systems, building telemetry for performance monitoring, profiling and debugging bottlenecks, and driving architectural improvements for efficiency. The role requires strong software engineering skills in Python and C++, deep understanding of GPU architectures, and experience with distributed computing systems and ML workloads. | Serve | 9 |
| Member of Technical Staff, Multimodal Infrastructure - MAI Superintelligence Team This role focuses on building and maintaining large-scale infrastructure for multimodal generative models, covering the full development cycle from data processing to training, inference, and serving. It involves working with research scientists and product engineers to optimize performance and drive architectural changes for consumer AI products like Copilot. | Serve, Post-train | 9 |
| Member of Technical Staff, Software Co-Design AI HPC Systems - MAI Superintelligence Team This role focuses on the co-design and productionization of next-generation AI systems at datacenter scale, optimizing end-to-end performance and efficiency. It operates at the intersection of models, systems software, networking, storage, and AI hardware, influencing accelerator design, system architectures, and large-scale AI platforms. The role involves analyzing real workloads, developing performance models, and partnering with various teams to drive high-impact ideas into production systems. It also contributes to research and the broader community through publications and open-sourcing. | Serve, Pretrain | 9 |
| Senior AI Hardware Architect Senior AI Hardware Architect role focused on defining and optimizing next-generation AI accelerator platforms and large-scale AI systems. Responsibilities include analytical performance modeling, workload characterization, profiling, and end-to-end performance analysis across GPU and accelerator architectures, working across hardware, software, and system boundaries. The role involves analyzing AI workloads, identifying performance bottlenecks, developing models for new architectural features, and correlating silicon data with models to drive optimizations for performance, efficiency, and TCO. Collaboration with various hardware and software teams is key to shaping future AI accelerator and system architectures. | Serve, Post-train | 8 |
| Software Engineer II and Sr. Software Engineer - AI Frameworks Develops software for AI/ML frameworks and tools, focusing on ONNX and ONNX Runtime for high-performance inference and training acceleration across various hardware. Also works on on-device AI inference solutions. | Serve | 8 |
| Principal Software Engineer - Performance Principal Software Engineer focused on optimizing the performance of AI model inference, particularly LLMs, across various hardware platforms (GPUs, Microsoft silicon). The role involves deep technical work on the AI software stack, from fundamental abstractions to system-level optimizations, aiming to improve efficiency and reduce costs for large-scale AI deployments, including those for Azure OpenAI service. | Serve | 8 |
| Principal Software Engineer, CoreAI Principal Engineer on the AI Core Infrastructure team, responsible for large-scale GPU management infrastructure and inference/training platforms powering Microsoft's AI workloads. The role involves setting roadmaps, designing backend services, and providing insights for customers to monitor, troubleshoot, and scale AI training workloads on supercomputers. Focus on ML infrastructure, distributed systems, and observability. | Serve, Post-train | 8 |
| Principal Software Engineering - AI Frameworks Principal Software Engineer on the AI Frameworks team at Microsoft, focusing on developing and optimizing software for running AI models across diverse hardware platforms. This includes working on ONNX, ONNX Runtime for high-performance inferencing and training acceleration, and Foundry Local for on-device inference. | Serve | 8 |
| Senior Software Engineer, CoreAI Workload Engines Senior Software Engineer focused on building and optimizing foundational inference engines and APIs for large-scale AI inference across Azure. The role involves improving latency, throughput, availability, and cost for LLMs, working with OpenAI and open-source models, and developing experimentation capabilities for safe and rapid iteration. | Serve | 8 |
| Principal Software Engineer, CoreAI Workload Engines Principal Software Engineer focused on building and optimizing foundational inference engines and APIs for large-scale AI inference across Azure. The role involves driving production-grade serving improvements for OpenAI and open-source LLMs, focusing on latency, throughput, availability, and cost efficiency. Responsibilities include making hands-on engine changes, building experimentation capabilities, and designing inference serving architectures to support multitenant AI systems at global scale. | Serve | 8 |
| Member of Technical Staff, Developer Experience - MAI Superintelligence Team This role focuses on building and optimizing the infrastructure and developer experience for large-scale ML model training and inference, specifically for Microsoft's AI assistant, Copilot. The responsibilities include improving CI/CD pipelines, developing training tools, enhancing cloud infrastructure, and managing model hosting systems for inference and data generation. The role aims to accelerate iteration and improve the quality of AI models powering innovative products. | Serve, Data | 8 |
| Member of Technical Staff, LLM Inference - MAI Superintelligence Team This role focuses on building and maintaining tools and systems for LLM inference, optimizing compute efficiency, and enabling researchers to run models for various tasks. It involves working with inference frameworks, GPU kernel programming, and distributed systems to improve model performance. | Serve | 8 |
| Software Engineer 2 Software Engineer to develop AI software for training and deploying advanced AI models, focusing on system software, developer tools, and optimizing large-scale training and inference on novel AI hardware and accelerators. | Serve, Post-train | 7 |
| Engineering Manager Engineering Manager to lead a team building and operating the cloud brain of Microsoft Defender's real-time protection services. This involves managing ML models, large-scale data platforms, and threat intelligence pipelines that operate at planetary scale with low-latency and high-availability requirements for over a billion users. | Serve | 7 |
| Principal Software Engineer This Principal Software Engineer role focuses on building and managing a hyperscale deployment system, leveraging AI to enhance efficiency, reliability, and automation. The role involves leading a team, mentoring engineers on AI-powered development practices, and driving innovation through AI solutions for deployment intelligence and operational excellence. It requires experience with AI-native development, LLMs, and integrating AI into the software development lifecycle. | Serve | 7 |
| Principal Software Engineer This Principal Software Engineer role focuses on building and operating mission-critical, hyperscale, high-performance, cost-efficient, and compliant AI infrastructure for LLM services within Microsoft 365 and other AI-powered products. The role involves leading the design, implementation, and delivery of an LLM API management service, with a strong emphasis on cost and availability management. | Serve | 7 |
| Principal Group Engineering Manager Principal Group Engineering Manager for AIInfra team at Microsoft, responsible for building and scaling the AI data-plane that powers LLM inferencing workloads across Microsoft and Azure customers. The role involves leading a large team to deliver inference capabilities for a wide range of LLMs with a focus on reliability, efficiency, and ultra-low latency. | Serve | 7 |
| Principal Software Engineer Principal Software Engineer to build and operate mission-critical, hyperscale, high-performance, cost-efficient, and compliant AI infrastructure that powers Microsoft's Large Language Model (LLM) services across Microsoft 365 and other AI-powered products. The role involves leading a team, driving the design and delivery of the AI inferencing platform, and ensuring platform cost efficiency, availability, and operational excellence. | Serve | 7 |
| Software Engineer II Software Engineer II role focused on building and operating mission-critical, hyperscale, high-performance, cost-efficient, and compliant AI infrastructure for LLM services within Microsoft 365 and other AI-powered products. The role involves leading the design, implementation, and delivery of LLM API management services, with a strong emphasis on cost and availability management, and collaboration across product teams. | Serve | 7 |
| Senior Consultant - Data & AI Senior Consultant role focused on designing and delivering AI-powered data solutions on Microsoft Azure. The role involves leading technical delivery, defining technology strategy, and implementing solutions with an AI-first mindset, acting as a hands-on contributor and potentially leading engineering teams. Emphasizes cloud-native architectures, data engineering principles, and rapid prototyping for production-ready AI solutions. | Serve | 7 |
| Senior Software Engineer The AI Core Infrastructure team is responsible for building and managing large-scale GPU management infrastructure and inference/training platforms for Microsoft's AI workloads. This Senior Software Engineer role focuses on fleet management, designing and developing core AI infrastructure services, and managing GPU clusters for LLM training and inference. | Serve, Post-train | 7 |
| Member of Technical Staff, Microsoft Robotics (Software Systems) This role focuses on the reliability, observability, and operational health of a production robotics platform that integrates humans, robots, and AI agents. It involves designing and operating observability infrastructure, incident response, deployment pipelines, secure cloud-to-edge communication, and capacity planning for robotics workloads. The role requires a strong background in SRE and systems engineering for both cloud and edge components. | Serve, Agent | 7 |
| Principal Software Engineer Principal Software Engineer role focused on building and scaling large distributed systems for search, recommendation, and AI services, specifically within the Bing IndexServe team. The role involves architecting and driving cutting-edge techniques like LLM, Ranking, and Index Serving on a massive scale (100K+ nodes), collaborating with ML/AI data scientists. The team aims to simplify the serving stack, improve relevance innovations with deep learning and LLMs, and build an agile, performant, stable, and efficient index serving platform that supports rapid implementation and iteration of relevance techniques and advanced AI toolsets. | Serve | 7 |
| Senior Software Engineer - Performance Senior Software Engineer focused on optimizing the inference performance of large language models (LLMs) like those from OpenAI, running on various hardware including GPUs and custom Microsoft silicon. The role involves benchmarking, debugging, and optimizing performance to enable efficient deployment at scale for major Microsoft products and Azure services. | Serve | 7 |
| Principal Silicon Performance Architect This role focuses on optimizing the performance of AI inference workloads by exploring micro-architectural innovations and validating end-to-end performance. The Principal Silicon Performance Architect will own performance modeling, analysis, and simulation infrastructure, working closely with chip, system, and software architects to drive data-backed design decisions for improved throughput, latency, and efficiency. | Serve | 7 |
| Principal Software Engineer Principal Software Engineer to design and build a Postgres-based database for modern, AI-native, agent-driven workloads within Microsoft Fabric. The role involves innovating on query planning, execution, and storage layers to support high-performance data access for next-generation applications, leveraging open storage formats and engines. | Serve | 7 |
| ML - Principal Software Engineer Principal Software Engineer role focused on building high-performance software for AI capabilities across Windows & Devices. The role involves architecting and building code for deploying ML models at scale, optimizing edge execution, and guiding system-level decisions for inference, memory, power, and security. It requires defining ML infrastructure strategy and has preferred experience in architecting ML inference pipelines for LLMs, local model integrations, and hardware-aware optimizations. | Serve | 7 |
| Member of Technical Staff, Full Stack - ML Efficiency & Observability - MAI Superintelligence Team Full Stack Engineer on the MAI Superintelligence Team focused on ML Efficiency & Observability, building capacity management portals and visibility into model performance for ML researchers and executives. The role involves designing and developing features for user interfaces, integrating with backend APIs for training frameworks, and contributing to internal tooling and infrastructure. | Serve | 7 |
| Principal AI Network Architect This role focuses on the network architecture for AI accelerator platforms, specifically for high bandwidth and low latency networks critical for AI GPU clusters. The Principal AI Network Architect will evaluate, design, and optimize the network stack from hardware to software kernels, influencing Azure product roadmaps and working with state-of-the-art networking labs. The role requires deep expertise in networking technologies and familiarity with AI model execution pipelines. | Serve | 7 |
| Member of Technical Staff, Site Reliability Engineer (HPC) - MAI SuperIntelligence Team The role is for a Site Reliability Engineer (SRE) focused on High Performance Computing (HPC) infrastructure for AI model training and inference. The engineer will ensure the reliability, availability, and efficiency of large-scale distributed AI systems, including GPU clusters, and will be involved in monitoring, automation, incident management, and security. | Serve | 7 |
| Member of Technical Staff, HPC Operations Engineering Manager This role manages a team of Site Reliability Engineers responsible for the reliability and efficiency of large-scale distributed AI infrastructure, specifically for training, fine-tuning, and serving generative AI models. The focus is on leading operations, observability, automation, incident management, and security within hybrid cloud/on-prem CPU+GPU environments, collaborating closely with ML engineers and platform teams. | Serve, Post-train | 7 |
| Software Engineer Software Engineer role focused on building and scaling the inferencing cloud for Large Language Models and GenAI Services within Azure CoreAI Platform. The role involves designing, building, and operating large-scale engineering systems for AI models. | Serve | 7 |
| Senior Software Engineer Senior Software Engineer role focused on designing, developing, and optimizing Azure's High Performance Computing and AI Platform (HPC/AI) virtual machines. This involves deep technical work on hardware/software interactions, device virtualization, and performance analysis of GPU workloads for large-scale AI training and inference. The role contributes to the underlying platform software and its exposure as an Azure service, with opportunities to work on upper layers of Azure infrastructure. | Serve | 7 |
| Senior Software Engineer The role focuses on designing and building cutting-edge networking infrastructure for large-scale AI training and inference in Azure Cloud. The goal is to enable breakthroughs in AI by delivering unmatched computational power, scalability, and reliability, with a focus on high performance, low latency, and minimal jitter for distributed AI workloads. | Serve | 7 |
| Software Engineer - Observability Software Engineer for Microsoft's Azure Data team, focusing on the Observability Platform. The role involves designing, developing, and operating large-scale telemetry ingestion pipelines that handle massive data volumes (Exabytes daily) and trillions of signals. Responsibilities include building APIs, integrating ML-based anomaly detection, ensuring reliability and scalability, and participating in on-call rotations. The platform underpins observability across Azure, Office, Windows, and Xbox. | Serve | 5 |
| Senior Software Engineer - Github Copilot API Senior Software Engineer on the Copilot API team at GitHub, focusing on building and maintaining scalable, reliable backend services and APIs that power GitHub Copilot features and integrations. The role involves designing, developing, and operating distributed systems with an emphasis on performance, reliability, and operational excellence. | Serve | 5 |
| Principal Group Engineering Manager - Microsoft Entra Principal Group Engineering Manager for Microsoft Entra, focusing on hyperscale platform challenges, availability of mission-critical services, and engineering systems for software serving over 1 billion people. The role involves leading managers and senior engineers to improve safe operation of hyperscale fleets using AI-driven operations for incident triage and resolution, correlating change logs with incident impact, architecting control planes with AI/telemetry for safe changes, and building runtimes for auto-triage and auto-mitigation of incidents. | Serve | 5 |
| Software Engineer II Software Engineer II role focused on building and scaling backend services for Microsoft AI Search Places, which uses geospatial knowledge and AI for location search. The role involves applying ML solutions to geospatial problems and experimenting with LLMs to improve system quality and efficiency. | Serve | 5 |
| Senior Software Engineer Senior Software Engineer on the Bing Multimedia team, focusing on building and evolving large-scale offline infrastructure for image and video search. The role involves designing, building, and operating scalable platforms that process vast amounts of data, with a strong emphasis on engineering metrics like latency, cost, availability, and quality. The team leverages state-of-the-art ML models and AI-assisted engineering practices to deliver world-class visual search experiences. | Serve | 5 |
| Senior Software Engineer Senior Software Engineer on the Surface Devices team responsible for designing, scaling, and maintaining CI/CD infrastructure for Windows OEM factory images. The role involves integrating Azure AI capabilities for intelligent log analysis, anomaly detection, and automation within the DevOps ecosystem. | Serve | 5 |
| Principal Software Engineer Principal Software Engineer to design, build, and operate core compute platform services for developers, enabling them to host and run various apps including AI Agent Apps at cloud scale. The role requires end-to-end technical leadership, hands-on component design and coding, and mentoring others on engineering and site reliability practices, with a focus on AI as a core building block. | Serve | 5 |
| Principal Software Engineer - CoreAI Principal Software Engineer for Microsoft's CoreAI Growth and Data Science team, focusing on data and analytics architecture, large-scale data pipelines, and leveraging AI to optimize workflows. The role involves cross-team collaboration, mentoring, and ensuring data governance and trust for AI workloads within the developer ecosystem. | Serve | 5 |
| Consultant A2 - Infra This role focuses on designing, building, and optimizing end-to-end cloud and on-premises infrastructure solutions, with a significant emphasis on supporting AI/ML workloads. The consultant will leverage Azure AI Services, containerized AI workloads, and integrate models into cloud environments, acting as a full-stack infrastructure consultant. | Serve | 5 |
| Sr Consultant - Infra Sr. Consultant focused on designing, building, and optimizing cloud and on-premises infrastructure solutions, with a specific emphasis on AI workloads. This role requires expertise in Azure AI Services, integrating frontier models, and managing AI developer tools as infrastructure components. The consultant will ensure secure, scalable, and high-performing environments for AI applications. | Serve | 5 |
| Principal Software Engineer Principal Software Engineer role focused on leading the architecture, design, and implementation of high-scale, low-latency services with an AI First approach within Microsoft's Identity engineering team. The role involves driving AI/ML-based engineering solutions, cloud environments (Azure), and large distributed systems, with a strong emphasis on security and reliability. | Serve | 5 |
| Software Engineer II Software Engineer II role focused on designing, developing, and optimizing networking infrastructure for large-scale AI training and inference in Azure Cloud. The role involves ensuring high performance, low latency, and minimal jitter for distributed AI workloads, working with cutting-edge networking hardware and software. | Serve | 5 |
| Senior Software Engineer- CTJ - Poly Senior Software Engineer to deliver secure, scalable, and mission-critical AI infrastructure for Microsoft’s sensitive cloud environments, focusing on foundational services for Azure Machine Learning, Azure AI Services, Azure OpenAI, and Microsoft Foundry. The role involves building and operating AI-native full-stack systems, leveraging modern tooling and AI systems to accelerate development and enhance product quality within air-gapped, sovereign, and commercial clouds. | Serve | 5 |
| Member of Technical Staff - Backend Engineer Backend Engineer for Microsoft Copilot, focusing on building and scaling the core backend platform including Orchestrator, Inference, and APIs to power AI-driven consumer experiences. The role involves developing secure, performant APIs, collaborating with cross-functional teams, and shipping high-quality code in a fast-paced environment. | Serve, Agent | 5 |
| MTS - Site Reliability Engineer This role is for a Site Reliability Engineer (SRE) focused on ensuring the reliability, availability, and efficiency of large-scale distributed AI infrastructure. The SRE will work with ML researchers, data engineers, and product developers to operate platforms for training, fine-tuning, and serving generative AI models. Key responsibilities include maintaining uptime, designing observability systems, optimizing performance, building automation for deployments and incident response, and ensuring security and compliance in hybrid cloud/on-prem CPU+GPU environments. The role requires strong experience in SRE/DevOps, Kubernetes, CI/CD, public cloud platforms, monitoring tools, and programming languages like Python or Go, with a preference for experience with large-scale GPU clusters and HPC. | Serve | 5 |
| Software Engineer II Software Engineer II role focused on designing and building next-generation networking infrastructure for large-scale AI training and inference in Azure Cloud. The role involves developing high-performance, low-latency, and reliable networking capabilities to support distributed AI workloads, working at the intersection of AI and high-performance computing. | Serve | 5 |