Principal Scientist, Generation Data Architect (Image & Video & Audio)

Adobe · Enterprise · San Jose, CA +1

A Principal Scientist role shaping the data strategy for Adobe Firefly's multimodal foundation models (image, video, audio). The focus is on sourcing, curating, and training on large-scale visual data to improve generative model quality, controllability, and realism for millions of users. This is a senior IC role involving cross-functional collaboration with research, engineering, and product teams.

What you'd actually do

  1. Develop a long-term approach to generation-focused data that improves image, video, and audio synthesis at scale
  2. Guide decisions around data quality, diversity, and composition for foundation model training
  3. Explore new methods to address gaps in current data approaches and improve model performance
  4. Work closely with model teams to align data design with model architecture and training behavior
  5. Build and refine data curricula across training stages (what data is used, when, and at what scale)

Skills

Required

  • ML
  • data systems
  • AI research
  • large-scale or foundation models
  • data strategies
  • generative systems
  • multimodal systems
  • image generation
  • video generation
  • audio generation
  • model behavior
  • interaction of data, model architecture, and training dynamics
  • communication
  • collaboration

Nice to have

  • Ph.D. in Computer Science, Machine Learning, or a related field

What the JD emphasized

  • 10+ years of experience in ML, data systems, or AI research, including work on large-scale or foundation models
  • Experience shaping data strategies, representations, or training approaches for generative or multimodal systems
  • Expertise in image and/or video and audio generation, with a strong understanding of how data affects model behavior

Other signals

  • multimodal foundation models
  • generative models
  • data strategy for AI