Member of Technical Staff - Data Platform

Microsoft · Big Tech · Mountain View, CA +2 · Data Engineering

This role focuses on building and architecting distributed data platforms and pipelines that process massive datasets for AI models, including training, inference, and evaluation. It involves designing event-driven architectures, handling unstructured data, engineering feedback loops for AI, and optimizing compute for cost and performance.

What you'd actually do

  1. Core Platform Engineering: Design and build the underlying frameworks (based on Spark/Databricks) that let internal teams process massive datasets efficiently, abstracting the complexity of ETL into self-service infrastructure (first sketch after this list).
  2. Distributed Systems Architecture: Modernize our data stack by moving from batch-heavy patterns to event-driven, streaming architectures that reduce latency for AI inference (second sketch below).
  3. Unstructured AI Data Pipelines: Architect high-throughput pipelines that process complex, non-tabular data (documents, code repositories, chat logs) into LLM pre-training, fine-tuning, and evaluation datasets (third sketch below).
  4. AI Feedback Loops: Engineer the high-throughput telemetry systems that capture user interactions with Copilot, creating the critical data loops required for Reinforcement Learning and model evaluation (fourth sketch below).
  5. Infrastructure as Code: Treat the data platform as software. Define and deploy all storage, compute, and networking resources using IaC (Bicep/Terraform) rather than manual configuration.
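To make item 1 concrete, here is a minimal sketch of what "ETL as self-service infrastructure" can look like on Spark: teams supply only a source, a sink, and a SQL transform, and the platform owns the plumbing. `PipelineSpec`, `run_pipeline`, and all paths are hypothetical illustrations, not Microsoft's actual framework.

```python
# A minimal sketch of a "self-service" pipeline abstraction on Spark.
# PipelineSpec, run_pipeline, and the paths are hypothetical; a real
# platform would add schema validation, lineage, retries, and access control.
from dataclasses import dataclass
from pyspark.sql import SparkSession, DataFrame

@dataclass
class PipelineSpec:
    source_path: str     # input location (e.g. a lake path)
    sink_path: str       # curated output location
    transform_sql: str   # the only thing a product team writes

def run_pipeline(spark: SparkSession, spec: PipelineSpec) -> None:
    """Load -> transform -> write, so teams never touch Spark plumbing."""
    df: DataFrame = spark.read.parquet(spec.source_path)
    df.createOrReplaceTempView("source")
    result = spark.sql(spec.transform_sql)
    result.write.mode("overwrite").parquet(spec.sink_path)

if __name__ == "__main__":
    spark = SparkSession.builder.appName("self-service-etl").getOrCreate()
    run_pipeline(spark, PipelineSpec(
        source_path="/data/raw/events",
        sink_path="/data/curated/daily_counts",
        transform_sql="SELECT user_id, COUNT(*) AS n FROM source GROUP BY user_id",
    ))
```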
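For item 2, a minimal sketch of the batch-to-streaming move using Spark Structured Streaming against a Kafka-compatible broker (Azure Event Hubs exposes a Kafka endpoint). The broker address, topic name, and paths are placeholders.

```python
# A minimal streaming sketch: read an unbounded event stream instead of a
# nightly batch extract, and land it where inference can pick it up quickly.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("inference-events").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder address
    .option("subscribe", "copilot-interactions")       # hypothetical topic
    .load()
    .select(col("key").cast("string"), col("value").cast("string"), "timestamp")
)

# Continuously append to a low-latency sink; checkpointing gives exactly-once
# file output across restarts.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "/data/stream/interactions")
    .option("checkpointLocation", "/checkpoints/interactions")
    .start()
)
query.awaitTermination()
```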
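For item 3, a minimal sketch of a non-tabular pipeline: whole documents in, chunked training-ready records out. The chunk size, the cleaning step, and the paths are illustrative assumptions; real LLM data prep adds deduplication, quality filtering, and tokenizer-aware splitting.

```python
# A minimal unstructured-data sketch: raw document files -> cleaned,
# fixed-size chunks suitable as LLM training records. All specifics
# (paths, CHUNK_CHARS, the trivial cleaner) are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf, input_file_name
from pyspark.sql.types import ArrayType, StringType

CHUNK_CHARS = 4000  # illustrative context-window-sized chunks

def chunk(text: str):
    text = " ".join(text.split())  # trivial stand-in for real cleaning
    return [text[i:i + CHUNK_CHARS] for i in range(0, len(text), CHUNK_CHARS)]

chunk_udf = udf(chunk, ArrayType(StringType()))

spark = SparkSession.builder.appName("doc-pipeline").getOrCreate()

docs = (
    spark.read.option("wholetext", True).text("/data/raw/docs")  # one row per file
    .withColumn("source", input_file_name())
)

records = docs.select("source", chunk_udf("value").alias("chunks"))
records.write.mode("overwrite").json("/data/pretraining/chunks")
```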
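For item 4, a minimal sketch of the capture side of a feedback loop: one Copilot-style interaction event, serialized for a telemetry topic. The schema and the `emit()` helper are assumptions; a production system would batch, sample, and scrub PII before anything reaches a training loop.

```python
# A minimal feedback-loop sketch: define one interaction event and serialize
# it for the wire. The schema and emit() are hypothetical; a real emitter
# would hand the bytes to a Kafka/Event Hubs producer.
import json
import time
import uuid
from dataclasses import dataclass, asdict

@dataclass
class InteractionEvent:
    session_id: str
    prompt_id: str
    response_id: str
    feedback: str      # e.g. "accepted", "rejected", "edited"
    latency_ms: int
    ts: float

def emit(event: InteractionEvent) -> bytes:
    """Serialize one event; downstream, these become RL and eval datasets."""
    return json.dumps(asdict(event)).encode("utf-8")

payload = emit(InteractionEvent(
    session_id=str(uuid.uuid4()),
    prompt_id="p-123", response_id="r-456",   # hypothetical IDs
    feedback="accepted", latency_ms=340, ts=time.time(),
))
```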

Skills

Required

  • Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or a related field AND 3+ years of experience in business analytics, data science, software development, data modeling, or data engineering; OR
  • Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or a related field AND 4+ years of experience in business analytics, data science, software development, data modeling, or data engineering; OR
  • Equivalent experience.

Nice to have

  • Bachelor's or Master's Degree in Computer Science, Software Engineering, or related technical field.
  • 4+ years of experience in Software Engineering or Data Infrastructure.
  • Proficiency in Python, Scala, Java, or Go. You write production-grade application code with unit tests, CI/CD, and modular design.
  • Deep Distributed Systems Knowledge: Demonstrated technical understanding of massive-scale compute engines (e.g., Apache Spark, Flink, Ray, Trino, or Snowflake). You should understand internals like query planning, memory management, and distributed consistency.
  • Experience architecting Lakehouse environments at scale (using Delta Lake, Iceberg, or Hudi).
  • Experience building internal developer platforms or "Data-as-a-Service" APIs.
  • Strong background in streaming technologies (Kafka, Azure Event Hubs, Pulsar) and stateful stream processing.
  • Experience with container orchestration (Kubernetes) for deploying data applications.
  • Experience enabling AI/ML workloads (Feature Stores, Vector Databases).

What the JD emphasized

  • Systems Builders
  • architect the backbone of Microsoft Copilot
  • build the "Paved Road" for AI
  • processing petabytes of data for the world's most advanced AI models
  • architect high-throughput pipelines capable of processing complex, non-tabular data (documents, code repositories, chat logs) for LLM pre-training, fine-tuning, and evaluation datasets
  • Engineer the high-throughput telemetry systems that capture user interactions with Copilot, creating the critical data loops required for Reinforcement Learning and model evaluation

Other signals

  • transforms raw, massive-scale signals into the fuel that powers training, inference, and evaluation for millions of users