Principal Product Manager, AI Model Security

Microsoft · Big Tech · Redmond, WA +2 · Product Management

Product Manager for AI Model Security on Microsoft's Superintelligence Team, focused on hardening frontier LLMs against security threats (prompt injection, jailbreaking, data exfiltration, and similar attacks) and on ensuring the models deliver real capability for security workflows. The role covers defining the security roadmap, driving exploit defense, building red-teaming frameworks, partnering with security product teams, and shaping launch readiness, with a strong emphasis on understanding attacker perspectives and balancing capability against risk.

What you'd actually do

  1. Own the model security roadmap: Define and prioritize the security hardening strategy for our frontier models across the full OWASP LLM threat surface — prompt injection (direct and indirect), data exfiltration, jailbreak resistance, system prompt leakage, training data extraction, and adversarial manipulation of agentic workflows.
  2. Drive zero-day and exploit defense: Work with researchers to evaluate and mitigate the risk of models being used to generate zero-day exploits, malware, or novel attack vectors. Define thresholds, build evaluation datasets, and own the decision framework for what the model should and should not be capable of in the security domain.
  3. Build and scale red-teaming frameworks: Design, run, and iterate adversarial testing programs — both automated and human-driven — to continuously probe model vulnerabilities. Establish metrics (e.g., jailbreak success rate, injection bypass rate, exfiltration resistance) and drive measurable improvement over time.
  4. Partner with Microsoft Security product teams: Work closely with Azure Security and Security Copilot teams to translate their product requirements into model training priorities. Ensure our models are purpose-built for threat detection, incident triage, vulnerability assessment, log analysis, and compliance reasoning.
  5. Define security-specific model evaluations: Build benchmark suites and evaluation frameworks that measure real-world security usefulness — not just academic performance. Drive training data strategy to improve domain-specific model quality for security practitioners.
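The metrics named in item 3 (jailbreak success rate, injection bypass rate) are simple rates over adversarial test runs. A minimal illustrative sketch, assuming a hypothetical per-attack result record (the `AttackResult` schema and `success_rate` helper are not part of the role description):

```python
from dataclasses import dataclass

@dataclass
class AttackResult:
    """One adversarial probe against the model (hypothetical schema)."""
    category: str    # e.g. "jailbreak", "injection", "exfiltration"
    succeeded: bool  # did the attack bypass the model's defenses?

def success_rate(results: list[AttackResult], category: str) -> float:
    """Fraction of attacks in `category` that bypassed defenses."""
    subset = [r for r in results if r.category == category]
    if not subset:
        return 0.0
    return sum(r.succeeded for r in subset) / len(subset)

# Example: 1 of 4 jailbreak probes succeeded -> jailbreak success rate 0.25
results = [
    AttackResult("jailbreak", False),
    AttackResult("jailbreak", True),
    AttackResult("jailbreak", False),
    AttackResult("jailbreak", False),
    AttackResult("injection", True),
]
print(success_rate(results, "jailbreak"))  # 0.25
```

Tracking these rates per release is what lets the role "drive measurable improvement over time": the same probe suite is re-run against each candidate model and the rates compared.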

Skills

Required

  • Product management experience
  • Security engineering experience
  • Software development experience
  • Hands-on experience with AI/ML systems
  • Deep familiarity with LLM security threats
  • Experience defining product requirements
  • Experience driving decisions with researchers or ML engineers
  • Track record of building evaluation systems, security benchmarks, or adversarial testing frameworks
  • Ability to operate autonomously
  • Ability to make decisions with incomplete information
  • Ability to drive projects from ambiguity to shipped outcomes

Nice to have

  • Technical background in computer science
  • Technical background in security
  • Technical background in AI/ML
  • Postgraduate degree
  • Experience in offensive security
  • Experience in penetration testing
  • Experience in red teaming (applied to AI/ML systems)
  • Familiarity with security workflows and tooling (SIEM, SOAR, EDR, threat intelligence platforms)
  • Understanding of how practitioners use security tools

What the JD emphasized

  • hardened against the full spectrum of LLM security threats
  • OWASP LLM Top 10
  • security practitioners
  • security analysts and incident responders
  • model training priorities
  • evaluation benchmarks
  • product requirements
  • security hardening strategy
  • adversarial manipulation of agentic workflows
  • zero-day exploit generation
  • adversarial testing programs
  • security usefulness
  • security criteria for model launches
  • security dimension of go/no-go decisions
  • LLM security landscape
  • security considerations into model training
  • fine-tuning
  • RLHF
  • post-training safeguards
  • security engineering
  • LLM security threats
  • prompt injection
  • jailbreaking
  • data exfiltration
  • adversarial attacks on generative models
  • red-teaming
  • security research
  • evaluation systems
  • security benchmarks
  • adversarial testing frameworks
  • offensive security
  • penetration testing
  • red teaming
  • AI/ML systems

Other signals

  • AI model security
  • LLM security threats
  • adversarial attack
  • product management
  • evaluation frameworks
  • security benchmarks