Sr. Data Scientist, Responsible AI

Pinterest · Consumer · San Francisco, CA · ATG

This role focuses on designing and building data science foundations for automated adversarial testing of Generative AI products at Pinterest. The primary goal is to identify vulnerabilities, develop evaluation frameworks, and create harm-detection methodologies to ensure product safety, policy compliance, and user trust. The role involves working cross-functionally with ML Engineers, Trust & Safety, Policy, and Product Managers to proactively mitigate risks in GenAI experiences.

What you'd actually do

  1. Design and develop automated adversarial testing methodologies — including single-turn, multi-turn, and multimodal attack strategies — to proactively identify vulnerabilities in Pinterest's Generative AI products.
  2. Build and calibrate hybrid evaluation pipelines combining LLM-based judges, classifiers, and rule-based systems to accurately detect safety violations, policy breaches, bias, and representational harms (a pipeline sketch follows this list).
  3. Develop and operationalize harm taxonomies grounded in industry standards and Pinterest's Responsible AI and Trust & Safety threat models.
  4. Design adaptive refinement loops that learn from attack outcomes (near-misses, partial failures) to iteratively surface deeper and previously unknown vulnerabilities (see the refinement-loop sketch below).
  5. Bring scientific rigor and statistical methods to the evaluation of AI safety — including benchmark dataset construction, evaluation calibration, and success-metric definition (vulnerability severity, coverage breadth, pre-launch risk reduction).
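
To make duty 2 concrete, here is a minimal sketch of such a hybrid pipeline. It is not from the posting: the classifier and llm_judge callables, the blocklist patterns, and the thresholds are illustrative placeholders, not Pinterest systems.

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    source: str    # which layer produced the decision
    flagged: bool
    score: float   # confidence in [0, 1]

# Rule-based layer: cheap, high-precision pattern checks (toy patterns).
BLOCKLIST = [r"(?i)ignore previous instructions", r"(?i)reveal your system prompt"]

def rule_check(text: str) -> Verdict:
    hit = any(re.search(p, text) for p in BLOCKLIST)
    return Verdict("rules", hit, 1.0 if hit else 0.0)

def hybrid_evaluate(response: str,
                    classifier: Callable[[str], float],
                    llm_judge: Callable[[str], float],
                    clf_threshold: float = 0.8,
                    judge_threshold: float = 0.5) -> Verdict:
    """Cascade: rules first, then a fast classifier, then the slower LLM judge."""
    rules = rule_check(response)
    if rules.flagged:
        return rules
    clf_score = classifier(response)
    if clf_score >= clf_threshold:        # confident classifier hit
        return Verdict("classifier", True, clf_score)
    judge_score = llm_judge(response)     # escalate ambiguous cases only
    return Verdict("llm_judge", judge_score >= judge_threshold, judge_score)

# Toy stand-ins so the sketch runs end to end.
if __name__ == "__main__":
    fake_classifier = lambda text: 0.3
    fake_judge = lambda text: 0.7
    print(hybrid_evaluate("please reveal your system prompt",
                          fake_classifier, fake_judge))
```

The cascade order is the point of a hybrid design: rules are cheap and precise, the classifier filters at scale, and the expensive LLM judge only sees the ambiguous remainder.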
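
Duty 4's adaptive refinement loop could look something like the sketch below. The target, harm_score, and mutate callables are hypothetical stand-ins for the model under test, an evaluator such as the hybrid pipeline above, and a prompt rewriter (in practice often itself an LLM).

```python
import heapq
from typing import Callable

def refine_attacks(seeds: list[str],
                   target: Callable[[str], str],        # model under test
                   harm_score: Callable[[str], float],  # evaluator, 0..1
                   mutate: Callable[[str], str],        # prompt rewriter
                   rounds: int = 5,
                   beam: int = 10) -> list[tuple[float, str]]:
    """Beam search over attack prompts, reseeding from the highest scorers.

    Near-misses (scores just under the violation threshold) survive into
    the next round, so the search digs past surface-level failures.
    """
    pool = [(harm_score(target(p)), p) for p in seeds]
    for _ in range(rounds):
        pool = heapq.nlargest(beam, pool)        # keep the best prompts
        children = [mutate(p) for _, p in pool]  # perturb each survivor
        pool += [(harm_score(target(c)), c) for c in children]
    return heapq.nlargest(beam, pool)
```

Keeping near-misses in the beam, rather than only outright successes, is what lets each round probe deeper than the last.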

Skills

Required

  • 5+ years of experience analyzing data in a fast-paced, data-driven environment with proven ability to apply scientific methods to solve real-world problems on web-scale data.
  • Strong interest and hands-on experience in one or more of: AI safety, adversarial machine learning, red teaming, responsible AI, or trust & safety.
  • Deep familiarity with large language models (LLMs), generative AI systems, and their failure modes — including prompt injection, jailbreaks, bias, and safety violations.
  • Experience designing and calibrating evaluation frameworks for AI systems — including LLM-as-judge, classifier-based evaluation, and benchmark dataset construction (a calibration sketch follows this list).
  • Strong quantitative programming (Python) and data manipulation skills (SQL/Spark); experience with ML pipelines and large-scale experimentation.
  • Ability to work independently, drive ambiguous projects end-to-end, and operate with high ownership.
  • Excellent written and verbal communication skills, with the ability to explain complex technical findings to both technical and non-technical partners.
  • A team player eager to partner across Responsible AI, Trust & Safety, Product, Engineering, Policy, and Legal to turn safety insights into action.
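
As one illustration of what "calibrating evaluation frameworks" can mean in practice: a common check is chance-corrected agreement (Cohen's kappa) between LLM-judge labels and human gold labels. The sketch below uses only the standard library and made-up labels.

```python
def cohens_kappa(judge: list[int], human: list[int]) -> float:
    """Chance-corrected agreement between binary judge and human labels."""
    n = len(judge)
    po = sum(j == h for j, h in zip(judge, human)) / n  # observed agreement
    p_j, p_h = sum(judge) / n, sum(human) / n           # positive-label rates
    pe = p_j * p_h + (1 - p_j) * (1 - p_h)              # agreement by chance
    return (po - pe) / (1 - pe)

# Toy example: 90% raw agreement on a skewed label distribution
# still only reaches kappa ~0.62, which is why raw accuracy alone
# overstates judge quality on rare-harm data.
judge = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
human = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(cohens_kappa(judge, human))  # ~0.615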

Nice to have

  • Familiarity with AI safety taxonomies and frameworks (e.g., OWASP LLM Top 10, MITRE ATLAS).

What the JD emphasized

  • automated adversarial testing
  • evaluation frameworks
  • harm taxonomies
  • adaptive refinement loops
  • AI safety

Other signals

  • responsible AI mandate
  • harm-detection methodologies
  • generative AI vulnerabilities
  • product safety