Senior Software Engineer

Microsoft · Big Tech · Redmond, WA +1 · Software Engineering

Senior Software Engineer to build unified infrastructure for multimodal feature extraction from meetings (audio, video, screen shares) to derive semantic meaning, supporting real-time and deferred processing for AI-driven workflows and agents within Microsoft 365 Copilot and other services.

What you'd actually do

  1. You will build a unified infrastructure that uses advanced models to derive semantic meaning from meetings. It will support real-time extraction where needed and deferred extraction where possible, with deferred work using idle CPU/GPU resources across IC3 and M365 Core.
  2. Our infrastructure will be designed for reuse across multimodal sessions with agents and CloudPC/CUA workloads, supporting scenarios for Digital Employee, W365A, Dynamics, and Researcher.
  3. By balancing real-time and offline processing, we can deliver richer meeting intelligence at a fraction of today’s cost, making it accessible and actionable for Copilot and agents throughout the Microsoft ecosystem.

Skills

Required

  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C#, Python, Rust, Java, C, or C++

Nice to have

  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C#, Python, Rust, Java, C, or C++
  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C#, Python, Rust, Java, C, or C++
  • Experience with cloud services, network programming, or large-scale server applications.
  • Experience with prompt engineering, evaluation strategies, and model hosting best practices.

What the JD emphasized

  • advanced models
  • semantic meaning
  • real-time extraction
  • deferred extraction
  • multimodal sessions
  • agents
  • AI-driven workflows

Other signals

  • building infrastructure for multimodal feature extraction
  • deriving semantic meaning from meetings
  • supporting real-time extraction
  • integrating voice, video, screen shares, and artifacts in real time
  • powering intelligent collaboration and AI-driven workflows