Strategic Projects Lead, Red Team

Scale AI · Data AI · New York, NY +1 · Gen AI Operations

Scale AI is seeking a Strategic Projects Lead for its Red Team and Safety function. The role focuses on managing partnerships with frontier AI model developers, stress-testing AI models, and shaping how those models are deployed. The lead will act as a subject-matter expert, coordinate delivery with research and operations, and contribute to public benchmark launches. The role requires technical curiosity, operational rigor, and strong communication skills to bridge technical and commercial audiences.

What you'd actually do

  1. Own a portfolio of frontier-lab partnerships day-to-day. Run customer conversations, scope new engagements, and grow accounts from one-off projects into ongoing scope.
  2. Act as a credible subject-matter expert in customer conversations. Explain adversarial methodology, vulnerability taxonomies, benchmark findings, and what they imply commercially, in language that lands with technical buyers.
  3. Partner with Enterprise and Public Sector account teams to qualify and close opportunities where safety, red teaming, or LLM-based cyber security are part of the deal.
  4. Coordinate delivery with research and operations leads. Make sure scope, capacity, and timelines line up before we commit, and unblock the team as engagements run.
  5. Contribute to public benchmark launches. Help shape the framing, the rollout, and the narrative we take to customers and the field.

Skills

Required

  • Working fluency in model behavior, adversarial ML, and AI safety
  • Experience managing technical accounts or partnerships
  • Strong written and verbal communication
  • Ability to translate between research and commercial audiences
  • Operational rigor
  • Comfort with ambiguity and a bias toward action

Nice to have

  • Experience managing technical accounts or partnerships, ideally with frontier AI labs, large enterprises, or federal agencies.
  • Genuine interest in AI safety, evidenced in prior work, writing, or research.
  • 2+ years of relevant experience.
  • Hands-on red teaming, penetration testing, or adversarial ML experience.
  • Background at a frontier AI lab, AI policy organization, or national security agency.
  • Experience contributing to public benchmarks, evaluations, or research publications.
  • Advanced degree in a relevant technical field.

What the JD emphasized

  • stress-tests the most capable AI models
  • partner closely with research, operations, and go-to-market
  • stress-tests frontier AI
  • generate adversarial data, run evaluations, and publish benchmarks
  • jailbreaks, prompt injection, agentic misuse, LLM-based cyber security, CBRN, political bias, and self-harm
  • shapes how leading labs train their models and how governments and enterprises deploy them
  • technical buyers
  • safety, red teaming, or LLM-based cyber security
  • scope, capacity, and timelines
  • public benchmark launches
  • roadmap
  • Working fluency in model behavior, adversarial ML, and AI safety
  • hold a substantive conversation with researchers and ML engineers
  • technical accounts or partnerships
  • frontier AI labs, large enterprises, or federal agencies
  • AI safety
  • translate between research and commercial audiences
  • Operational rigor
  • bias toward action
  • Hands-on red teaming, penetration testing, or adversarial ML experience
  • frontier AI lab, AI policy organization, or national security agency
  • public benchmarks, evaluations, or research publications

Other signals

  • stress-tests the most capable AI models
  • shapes how labs, governments, and enterprises deploy them
  • partner with frontier model developers
  • generate adversarial data, run evaluations, and publish benchmarks