Director, Research - Frontier Benchmarks

Snorkel AI · Data AI · San Francisco, CA · 316 - Research

This Director of Research role at Snorkel AI focuses on designing datasets that advance the frontier of AI. The role involves leading a team to identify data needs, design benchmarks, and collaborate with academic and industry partners. It sits at the intersection of research, product strategy, and go-to-market, with a strong emphasis on data-centric research and evaluation.

What you'd actually do

  1. Define the process and decide what data we build next by combining market signal from customers and GTM with your team’s read on where frontier labs are headed.
  2. Define and own the quality bar for benchmark design, and support GTM in positioning our benchmarks against existing datasets.
  3. Mentor and grow a technical team comfortable with cross-functional and customer-facing work, setting a high standard for quality and velocity while staying hands-on yourself.
  4. Foster external and academic collaborations related to our areas of interest.
  5. Stay at the frontier: track new benchmarks, agentic evaluation, and RL training research, and tie them back to the dataset portfolio we focus on.

Skills

Required

  • 8+ years in applied AI, ML, or research roles
  • experience leading or mentoring senior researchers
  • experience setting the roadmap and vision for teams
  • collaborating frequently with the executive team
  • operating cross-functionally
  • managing competing priorities
  • experience and interest in being customer-facing
  • translating technical contributions to business impact

Nice to have

  • Ph.D. in machine learning, NLP, or a related field preferred
  • equivalent industry or research lab experience considered

What the JD emphasized

  • designing the datasets that define and advance the frontier of AI
  • identify the data, skills and capabilities that will improve model performance
  • uncover gaps in current frontier models
  • design a wide variety of RL and domain-specific benchmarks
  • building benchmarks
  • data-centric research
  • Track new benchmarks, agentic evaluation, and RL training research