Sr Staff R&D Engineer

Disney · Media · Nicasio, CA +1

This role leads the research, design, and implementation of AI/ML technologies for audio intelligence in media production. The engineer will architect, build, and optimize machine learning systems, focusing on speech processing, style transfer, source separation, and generative audio synthesis. Responsibilities include driving scalable training pipelines, developing generative models, managing the end-to-end model lifecycle, and collaborating on integration into a SaaS environment.

What you'd actually do

  1. Lead the research, design, and implementation of state-of-the-art machine learning algorithms for speech processing, voice transfer, source separation, and upmixing in media post-production environments.
  2. Drive the architecture and deployment of scalable model training pipelines using PyTorch and distributed computing frameworks.
  3. Develop novel generative audio models, including latent diffusion, flow-based models, variational autoencoders, and neural vocoders, optimized for professional soundtrack production.
  4. Own end-to-end model lifecycle management: pretraining, fine-tuning, validation, inference optimization, and CI/CD integration.
  5. Guide the development of personalized model adaptation workflows to support per-user tuning, cross-project continuity, and flexible deployment.

Skills

Required

  • MSc or PhD in Computer Science, Electrical Engineering, Applied Math, or a related field with a focus on AI/ML and multi-modal signal processing
  • 5 years of professional experience in applied ML
  • Expertise in building and scaling models using PyTorch
  • Fluency in training, fine-tuning, and inference for deep neural networks
  • Demonstrated experience developing generative models such as VAE, GAN, diffusion models, or neural vocoders
  • Deep understanding of audio-specific ML domains, including source separation, speech enhancement, music processing, and cross-modal tasks
  • Experience with MLOps tooling (e.g., Weights & Biases, MLflow, Datachain)
  • Docker-based containerization
  • Scalable infrastructure for distributed training
  • Fluency in audio signal processing fundamentals
  • Integration of DSP into ML pipelines
  • Proven ability to contribute to architectural planning, research strategy, and production deployment

Nice to have

  • Familiarity with audio/text/video multi-modal frameworks and cross-domain representations
  • Experience implementing real-time or near-real-time inference pipelines in cloud or edge environments
  • Working knowledge of latent diffusion audio models
  • Strong knowledge of industry-standard audio datasets and benchmarks
  • Experience optimizing inference pipelines for creative applications or interactive use
  • Proficiency in lower-level audio frameworks (C / C++)
  • Contributions to published research at top-tier conferences and/or open-source ML frameworks

What the JD emphasized

  • Deep focus on audio-centric AI/ML research and deployment
  • Expertise in building and scaling models using PyTorch
  • Demonstrated experience developing generative models
  • Deep understanding of audio-specific ML domains
  • Proven ability to contribute to architectural planning, research strategy, and production deployment

Other signals

  • leading research and development of AI/ML technologies
  • architecting and deploying scalable model training pipelines
  • developing novel generative audio models
  • owning end-to-end model lifecycle management
  • contributing to technical direction and production-ready solutions