Senior Applied Scientist

Adobe · Enterprise · San Jose, CA

Senior Applied Scientist at Adobe focused on improving the quality and controllability of generative multimodal models (images and videos). The role involves designing and implementing end-to-end training pipelines for foundational models, leading core development in pre-training areas, and developing scalable workflows for data curation and distributed training. Collaboration with research, data, evaluation, infrastructure, pre-training, and post-training teams is key to enhancing editing quality, instruction-following, visual fidelity, and edit consistency.

What you'd actually do

  1. Design and implement end-to-end training pipelines to build foundational models for both images and videos.
  2. Lead core development for specific pre-training areas (e.g., text-to-image and text-to-video), while aligning with broader team strategy.
  3. Develop scalable workflows for data curation, data quality improvements, and distributed training.
  4. Partner closely with research, data, evaluation, infrastructure, pre-training, and post-training teams to improve editing quality for both images and videos.
  5. Collaborate closely with both the pre-training and post-training teams to understand the model’s capabilities and limitations and propose actionable solutions to improve quality.

Skills

Required

  • Ph.D. in Computer Science, Machine Learning, or a related field preferred
  • Proven track record in pre-training of large-scale multimodal models, specifically cross-modality work on image and video data
  • Deep understanding of pre-training for multimodal generative models
  • Strong expertise in Vision-Language Models (VLMs), including experience with contrastive learning, multimodal alignment, and leveraging VLM-based encoders to improve semantic understanding in generative tasks
  • Deep understanding of modern diffusion-based architectures such as Diffusion Transformers (DiT)
  • Ability to design and implement scalable pipelines for data curation, data quality control, and distributed training in collaboration with data and infrastructure teams
  • Experience optimizing model inference and deployment for high-throughput product environments, ensuring a balance between generative quality and computational efficiency
  • Strong publication record
  • Previous industry-level internship experience

What the JD emphasized

  • Proven track record in pre-training of large-scale multimodal models
  • Deep understanding of pre-training for multimodal generative models
  • Strong publication record

Other signals

  • design and implement end-to-end training pipelines
  • lead core development for specific pre-training areas
  • improve instruction-following, visual fidelity, and edit consistency