GenAI Strategic Projects Lead, Public Sector

Scale AI · Data AI · Washington, DC · Public Sector

Scale AI is seeking a GenAI Strategic Projects Lead for its Public Sector team in Washington, DC. The role owns high-impact projects focused on generative AI data-labeling pipelines, operational processes for data workforce management, and improving the quality of training and evaluation datasets for public sector customers, particularly in national security. It involves developing infrastructure, managing data production pipelines, partnering with SMEs and customers, and influencing cross-organizational strategy to enable mission-critical AI applications.

What you'd actually do

  1. Develop, build, and maintain the infrastructure required to ensure data pipelines are efficient, scalable, and produce high-quality outputs
  2. Take ownership of day-to-day progress on high-priority data production pipelines, ensuring projects move forward efficiently
  3. Partner with subject matter experts in their fields to validate the quality of our data and to translate deep domain knowledge into scalable processes and measurable outcomes
  4. Work closely with customers to understand their requirements and design data taxonomies that optimize model performance
  5. Own larger and larger components of our data delivery processes, until you ultimately serve as the full owner of our most visible and high-impact customer pipelines

Skills

Required

  • 5+ years of experience in product development, data science, or operations
  • History of successful project management
  • Comfort with ambiguity
  • Ability to analyze complex operational data, build queries, and identify trends to inform decisions and optimize processes
  • Technical aptitude to understand how to produce data for state-of-the-art post-training techniques such as supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), reinforcement learning with verifiable rewards (RLVR), etc.

Nice to have

  • Experience working in defense tech and/or an AI company
  • A technical degree in fields like computer science, data science, or engineering
  • A deep understanding of ML operations for generative AI workflows / products
  • An active Top Secret security clearance

What the JD emphasized

  • world-class training and test and evaluation data for Large Language Models for our Public Sector customers
  • build Generative AI data-labeling pipelines from the ground up
  • create operational processes to manage and optimize an in-house expert data workforce
  • develop novel technology-driven approaches to improve the quality of our training and evaluation datasets
  • partner directly with our internal machine learning experts and external stakeholders
  • ensure our data enables the development of mission-critical applications of AI
  • post-training techniques such as supervised fine tuning (SFT), reinforcement learning through human feedback (RLHF), Reinforcement Learning with Verifiable Rewards (RLVR)

Other signals

  • Generative AI data-labeling pipelines
  • operational processes to manage and optimize an in-house expert data workforce
  • novel technology-driven approaches to improve the quality of our training and evaluation datasets
  • data enables the development of mission-critical applications of AI