Principal TPM Data & Telemetry - Windows Reliability

Microsoft · Big Tech · Redmond, WA +1 · Technical Program Management

This role focuses on managing and improving data pipelines and telemetry for Windows reliability. It involves ensuring data quality, building executive-ready reports, and collaborating with engineering and partner teams to translate reliability signals into actionable decisions. The role emphasizes operational rigor, process improvement, and building team resilience through documentation and training.

What you'd actually do

  1. Own Reliability Telemetry “Run-the-System” Operations
  2. Deliver Executive-Ready Reliability Reporting & Insights
  3. Partner Deeply Across Engineering and Ecosystem Stakeholders
  4. Lead Programs That Improve Reliability Signal Quality and Actionability
  5. Build Team Resilience and Depth

Skills

Required

  • Bachelor's Degree AND 6+ years of experience in engineering, product/technical program management, data analysis, or product development OR equivalent experience.

Nice to have

  • 3+ years of experience managing cross-functional and/or cross-team projects.
  • 7+ years of experience in one or more of: program management, data/analytics engineering, reliability engineering, telemetry operations, or product analytics.
  • Demonstrated experience owning end-to-end telemetry/analytics systems (ingestion → validation → modeling → dashboards → operational consumption).
  • Solid skills in data querying and analysis (e.g., Kusto/ADX, SQL, equivalent large-scale log analytics).
  • Experience building decision-grade reporting (e.g., Power BI or equivalent) and communicating insights to senior stakeholders.
  • Proven ability to drive cross-functional execution: aligning stakeholders, assigning ownership, and delivering outcomes through ambiguity.
  • Operational excellence mindset: quality bars, monitoring, incident management, documentation, and continuous improvement.
  • Familiarity with Windows reliability concepts (crash telemetry, drivers, servicing, regressions, device cohorts).
  • Experience with large-scale cloud data platforms (Azure data ecosystem, distributed pipelines, identity resolution).
  • Ability to automate analysis/reporting (Python, C#, Spark, data pipelines, workflow orchestration).
  • Prior experience working with hardware + software ecosystem partners (OEMs, IHVs, silicon vendors) or device quality programs.
  • Experience defining metrics/governance: semantic layers, taxonomy, standard definitions, and “single source of truth” design.
  • Comfort operating in a fast-paced environment with multiple stakeholders and shifting priorities.
  • Solid written communication skills (executive-ready narrativ

What the JD emphasized

  • reliability telemetry
  • data quality
  • operational rigor
  • actionable decisions
  • stakeholders