Senior Director - GenAI Data Strategy

NVIDIA · Semiconductors · Santa Clara, CA

Senior Director role focused on defining and executing a comprehensive data strategy for foundation models, encompassing multi-modal data acquisition, curation, synthetic generation, and alignment techniques like RLHF. This role bridges research insights with data collection to improve model performance and safety, and engages with customers to translate deployment gaps into data priorities.

What you'd actually do

  1. Define and evolve the end-to-end roadmap for multi-lingual data acquisition across various modalities (text, vision, audio, etc.) to train and evaluate large-scale AI models.
  2. Collaborate with research teams to interpret model edge cases and failure modes. Use these insights to refine data collection logic, crafting a continuous loop where model output informs the next phase of data acquisition.
  3. Manage an external ecosystem of data partners and vendors.
  4. Lead the strategy for high-quality human data collection, including RLHF (Reinforcement Learning from Human Feedback), SFT (Supervised Fine-Tuning), and complex "human-in-the-loop" workflows to ensure model safety and alignment.
  5. Orchestrate the synthetic data stream used to bootstrap and amplify model fine-tuning and evaluation techniques, effectively bridging the gap between real-world scarcity and infinite scale.

Skills

Required

  • AI/ML sector experience
  • People management experience
  • LLM/VLM architectures
  • RLHF
  • RLAIF
  • Data pipelines
  • Diverse modalities
  • Customer empathy
  • Data acquisition
  • Data curation
  • Synthetic data generation
  • Model alignment
  • Human-in-the-loop workflows

Nice to have

  • Python
  • SQL
  • Spark
  • Scale.ai
  • Labelbox

What the JD emphasized

  • 18+ years of overall experience in product management or data operations, specifically within the AI/ML sector at a technology company.
  • 8+ years of direct people management experience.
  • Deep understanding of LLM/VLM architectures, training regimes, and alignment methods like RLHF and RLAIF.
  • A "Full Stack" data perspective, with a proven track record of managing large-scale data pipelines and dealing with diverse modalities.
