What you'd actually do

Design and build scalable data pipelines (batch and streaming) to process large volumes of security and operational data.

Design and evolve data models, schemas, and storage strategies for analytics and AI use cases.

Implement data validation, quality checks, and observability frameworks to ensure data accuracy and reliability.

Enable high-quality datasets for AI/ML teams, including support for feature pipelines and training data preparation.

Partner with engineering, data science, and product teams to deliver end-to-end data solutions.

Skills

Required

Bachelor's Degree in Computer Science or related technical field
2+ years of software, data, or related engineering experience
Coding experience in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
Ability to meet Microsoft, customer, and/or government security screening requirements

Nice to have

Hands-on experience building or maintaining data pipelines and distributed data systems
Programming experience in Python, Scala, or SQL for data processing and pipeline development
Understanding of data modeling, ETL processes, and large-scale data processing concepts

Other signals

data foundations that power AI-native security systems

transform raw signals ... into high-quality, trusted datasets for analytics and AI-driven insights

enabling downstream AI models

data is structured for downstream ML pipelines, feature engineering, and analytics workloads

Enable high-quality datasets for AI/ML teams

Collaborate with AI engineers to ensure data is optimized for RAG pipelines, model training, and evaluation workflows

Overview

Security represents the most critical priorities for our customers in a world awash in digital threats, regulatory scrutiny, and estate complexity. Microsoft Security aspires to make the world a safer place for all. We want to reshape security and empower every user, customer, and developer with a security cloud that protects them with end-to-end, simplified solutions. The Microsoft Security organization accelerates Microsoft’s mission and bold ambitions to ensure that our company and industry are securing digital technology platforms, devices, and clouds in our customers’ heterogeneous environments, as well as ensuring the security of our own internal estate. Our culture is centered on embracing a growth mindset, a theme of inspiring excellence, and encouraging teams and leaders to bring their best each day. In doing so, we create life-changing innovations that impact billions of lives around the world.

Microsoft Security's **Getting Customers Ready for AI team** is seeking a Data Engineer II to help build the data foundations that power AI-native security systems.

In this role, you will design and develop scalable data platforms, pipelines, and telemetry systems that transform raw signals across identity, devices, data, applications, and infrastructure into high-quality, trusted datasets for analytics and AI-driven insights.

You will work in a collaborative engineering environment to deliver reliable, secure, and performant data systems, enabling downstream AI models, security detections, and customer-facing experiences. This role is ideal for engineers looking to deepen their expertise in large-scale data systems, distributed processing, and modern data platform architectures for AI and Agents.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

**Data Platform & Pipeline Engineering**

Design and build scalable data pipelines (batch and streaming) to process large volumes of security and operational data.
Develop and optimize ETL/ELT workflows that transform raw telemetry into structured, consumable datasets.
Implement data ingestion frameworks to integrate multi-source data from services, APIs, and event streams.
Improve pipeline performance, reliability, and efficiency through partitioning, indexing, and optimization techniques.

**Data Modeling & Storage Systems**

Design and evolve data models, schemas, and storage strategies for analytics and AI use cases.
Work with distributed storage systems (e.g., data lakes, warehouses) to ensure scalability and cost efficiency.
Maintain data partitioning, retention, and lifecycle strategies aligned to business and compliance needs.
Ensure data is structured for downstream ML pipelines, feature engineering, and analytics workloads.

**Data Quality, Governance & Security**

Implement data validation, quality checks, and observability frameworks to ensure data accuracy and reliability.
Apply best practices for data governance, lineage, and auditing across pipelines and datasets.
Ensure compliance with security, privacy, and regulatory requirements when handling sensitive data.
Contribute to standardization of data contracts and schemas across services.

**AI Readiness & Intelligence Enablement**

Enable high-quality datasets for AI/ML teams, including support for feature pipelines and training data preparation.
Collaborate with AI engineers to ensure data is optimized for RAG pipelines, model training, and evaluation workflows.
Build and maintain telemetry pipelines and metrics systems that provide insights into AI system performance and usage.
Support development of data-driven signals and insights that improve customer security posture.

**Collaboration & Execution**

Partner with engineering, data science, and product teams to deliver end-to-end data solutions.
Contribute to system design discussions, architecture reviews, and cross-team integration efforts.
Work across teams to ensure consistent data definitions and interoperability across platforms.

**Engineering Excellence**

Write clean, maintainable, and well-tested code for data pipelines and platform services.
Build monitoring and alerting mechanisms for pipeline health, latency, and data quality issues.
Document data architecture, pipeline designs, and operational practices for knowledge sharing and scalability.
Follow best practices for performance optimization, cost management, and reliability engineering.
Embody our culture and values.

Qualifications

**Required Qualifications**

Bachelor's Degree in Computer Science or related technical field AND 2+ years of software, data, or related engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience

**Other Requirements:** The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years.

**Preferred Qualifications**

Hands-on experience building or maintaining data pipelines and distributed data systems.
Programming experience in Python, Scala, or SQL for data processing and pipeline development.
Understanding of data modeling, ETL processes, and large-scale data processing concepts.
Experience with big data frameworks (e.g., Spark, Flink) or streaming systems (e.g., Kafka/Event Hub).
Familiarity with cloud-based data platforms (Azure preferred, e.g., ADLS, Synapse, Databricks).
Exposure to data governance, security, and compliance practices in enterprise environments.
Experience supporting AI/ML workloads through data pipelines, feature stores, or training datasets.
Developed problem-solving skills with the ability to operate in ambiguity and deliver incrementally.

#MSFTSecurity

Software Engineering IC3 - The typical base pay range for this role across the U.S. is USD $102,100 - $202,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $133,800 - $219,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about **requesting accommodations.**