Inference Specialist, Creative Technology - Interpositive

Netflix · Big Tech · Los Angeles, CA +1 · Content Production Operations

This role focuses on operating and supporting custom generative AI inference workflows for creative projects at Netflix. The specialist will run, monitor, and troubleshoot GPU-based inference jobs, prepare and validate inputs, tune inference parameters, debug generation issues, and maintain repeatable launch workflows. They will partner with researchers and engineers to test new models and translate experimental capabilities into production practices, ultimately owning quality control for generated outputs and bridging communication between creative, production, research, and engineering teams.

What you'd actually do

  1. Operate and support custom generative AI inference workflows across a wide variety of film and series projects
  2. Run, monitor, and troubleshoot GPU-based inference jobs across local workstations, cloud infrastructure, and/or cluster environments, including distributed multi-GPU runs
  3. Prepare and validate inputs for model inference, including video, image, audio, masks, conditioning assets, prompts, metadata, and configuration files
  4. Tune inference parameters in collaboration with Creative Technology leadership, artists, researchers, and engineers to achieve production-quality results
  5. Debug failed or degraded runs by inspecting logs, outputs, configs, model checkpoints, data shapes, masks, frame ranges, codecs, GPU utilization, and environment issues
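As an illustration of the input validation and frame-range debugging named in items 3 and 5, the following is a minimal sketch of a missing-frame check on an image sequence. The naming convention (`<name>.<frame>.<ext>`), the function name, and the sample filenames are assumptions made for this example and are not taken from the posting.

```python
import re

# Hypothetical naming convention: <name>.<frame>.<ext>, e.g. shot010.1001.exr
FRAME_PATTERN = re.compile(r"^(?P<name>.+)\.(?P<frame>\d+)\.(?P<ext>\w+)$")


def find_missing_frames(filenames, first, last):
    """Return sorted frame numbers in [first, last] that have no matching file."""
    present = set()
    for filename in filenames:
        match = FRAME_PATTERN.match(filename)
        if match:
            present.add(int(match.group("frame")))
    return [frame for frame in range(first, last + 1) if frame not in present]


# Illustrative file list with two gaps in the requested range.
files = ["shot010.1001.exr", "shot010.1002.exr", "shot010.1004.exr"]
print(find_missing_frames(files, 1001, 1005))  # [1003, 1005]
```

A check like this would typically run before a job is launched, so that a gap in the sequence surfaces as a clear input error and not as a degraded or failed run.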

Skills

Required

  • 4+ years of relevant experience in machine learning production, VFX technology, post-production engineering, creative technology, technical direction, or a closely related technical production role
  • Hands-on experience running GPU-based model inference for image, video, audio, or multimodal generative AI systems
  • Experience working with Python-based ML codebases and command-line workflows in Linux environments
  • Experience debugging production runs using logs, stack traces, configuration files, model inputs, and generated outputs
  • Working knowledge of deep learning inference concepts, including checkpoints, schedulers or samplers, seeds, precision, batching, conditioning, and GPU memory constraints
  • Experience with video and image production formats, including frame sequences, ProRes, H.264/H.265, EXR, PNG, MP4/MOV containers, resolution handling, frame rates, and colorspace considerations
  • Experience coordinating technical work across creative, production, research, and engineering stakeholders
  • Demonstrated ability to operate effectively in a fast-moving R&D environment where tools, models, and workflows change frequently
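To make the precision, batching, and GPU memory concepts above concrete, here is a rough sketch that estimates the size of a single dense video tensor. The function name and the (batch, frames, channels, height, width) layout are assumptions for illustration. The figure covers only one input tensor and excludes model weights and intermediate activations, which usually dominate real memory use.

```python
# Bytes per element for common inference precisions.
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "bf16": 2}


def video_tensor_bytes(batch, frames, channels, height, width, precision="fp16"):
    """Size in bytes of one dense (batch, frames, channels, height, width) tensor."""
    elements = batch * frames * channels * height * width
    return elements * BYTES_PER_ELEMENT[precision]


# One 48-frame RGB clip at 1920x1080 in half precision.
size = video_tensor_bytes(1, 48, 3, 1080, 1920, "fp16")
print(f"{size / 2**20:.1f} MiB")  # 569.5 MiB
```

Switching the same clip to `fp32` doubles the estimate, which is one reason precision and batch size are often the first parameters adjusted when a run exceeds GPU memory.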

Nice to have

  • Strong practical understanding of generative AI inference workflows, especially for video, image, audio, or multimodal models
  • Comfort working in Linux shells, Python environments, Git repos, config files, logs, and GPU infrastructure
  • Strong debugging instincts: able to isolate whether a problem is data, model, environment, code, infrastructure, or user configuration
  • Ability to reason about video and tensor fundamentals, including frame counts, aspect ratios, spatial resolution, temporal alignment, masks, channels, and batch dimensions
  • Experience with tools and libraries commonly used in production ML workflows, such as PyTorch, CUDA, ffmpeg, OpenCV, NumPy, safetensors, and distributed launch tools
  • Comfort with job schedulers, cloud GPU environments, or cluster workflows; Slurm experience is a strong plus
  • Careful eye for generated output quality, including temporal artifacts, mask errors, motion issues, color shifts, compression problems, and sync problems
  • Able to balance creative iteration speed with technical rigor, reproducibility, and clear communication
  • Self-directed and ownership-minded; comfortable seeing a messy problem, creating a path through it, and pulling in help when needed
  • Collaborative and calm under pressure, especially when supporting time-sensitive creative reviews or production deadlines
  • Strong written communication, including the ability to document workflows, summarize test results, and explain technical findings to non-technical partners
  • Comfort with ambiguity, rapidly changing tools, and incomplete information
  • Genuine interest in tooling for filmmakers, with the curiosity to engage deeply with both the creative possibilities and the engineering realities of the work

What the JD emphasized

  • production-quality results
  • experimental model capabilities into usable production practices
  • GPU-based model inference
  • Python-based ML codebases
  • Linux environments
  • debugging production runs
  • deep learning inference concepts
  • video and image production formats

Other signals

  • model inference workflows
  • generative AI
  • GPU-based inference