Director, Multimodal Data Research

Adobe · Enterprise · San Jose, CA

Director to lead the Multimodal Data Research organization, which is responsible for the scale, quality, and innovation of multimodal training data (image, video, audio) for Adobe Firefly's foundational generative and editing models. The role involves setting the vision, leading multiple senior teams, and orchestrating execution across data ingestion, filtering, captioning, generation, editing-centric datasets, and data-driven training experiments.

What you'd actually do

  1. Set and own the long‑term vision and roadmap for multimodal data research across image, video, and audio.
  2. Define clear investment priorities across data quality, coverage, scalability, and iteration speed to support Firefly’s foundational and editing‑centric models.
  3. Build and evolve an organization that balances research innovation with operational excellence in data delivery.
  4. Lead multiple teams spanning multimodal data quality and scaling, captioning and prompt rewriting, data frameworks and infrastructure, and systematic training ablations to advance large‑scale pretraining and editing datasets.
  5. Establish org‑wide quality bars, taxonomies, and evaluation signals for multimodal training data.

Skills

Required

  • 10+ years of experience in machine learning, data systems, or applied research
  • Significant exposure to large-scale ML training data
  • Deep understanding of how data quality, composition, and labeling affect generative and editing model behavior
  • Experience working with multimodal data (image, video, audio) and large-scale data pipelines
  • Strong intuition for balancing research exploration with production constraints
  • Proven experience leading multi-team organizations in technically complex environments
  • Ability to set long-term vision while driving near-term execution
  • Strong cross-functional leadership skills, with experience influencing research, product, infrastructure, and executive stakeholders
  • Excellent communicator, able to synthesize complex technical and organizational tradeoffs

Nice to have

  • Master’s degree or Ph.D. in Computer Science, Machine Learning, Data Science, or a related field

What the JD emphasized

  • multimodal training data
  • large-scale ML training data
  • multimodal data (image, video, audio)
  • data quality, composition, and labeling
  • generative and editing model behavior
  • large-scale data pipelines
  • multi-team organizations

Other signals

  • scaling multimodal training data
  • foundational generative and editing models
  • data foundation for multimodal intelligence
  • data-driven training experiments
  • next-generation data frameworks