Principal Machine Learning Engineer, Communication Safety

Roblox · Consumer · San Mateo, CA · Machine Learning

Roblox is seeking a Principal Machine Learning Engineer for Safety AI Systems to define the technical strategy and execution roadmap for ML solutions in content and communication safety. This role involves leading a team, architecting large-scale systems for proactive moderation using LLMs and multimodal AI, and ensuring the safe deployment of innovative technologies to combat critical harms on the platform.

What you'd actually do

  1. Define and lead the multi-year technical vision, architectural strategy, and execution for machine learning solutions across Content and Communication Safety, ensuring these systems proactively and effectively detect and prevent high-severity, critical harms at massive scale.
  2. Act as the highest technical authority for the Content Safety ML domain, guiding the architecture and long-term maintainability of foundational models, data pipelines, and real-time inference services.
  3. Identify and champion the most ambiguous, high-leverage technical problems, driving alignment and securing investment for organization-wide ML infrastructure and platform development initiatives that benefit all of Trust & Safety.
  4. Oversee the adoption and safe deployment of innovative technologies (e.g., advanced NLP, self-supervised learning, multimodal LLMs) to anticipate and mitigate novel abuse vectors, moving beyond reactive detection to proactive intervention.
  5. Collaborate with executive-level Product, Data Science, Policy, and Operations leaders to define and prioritize the strategic machine learning roadmap, influencing product strategy and demonstrating the impact of ML on user trust and safety outcomes.

Skills

Required

  • Designing, developing, and operating large-scale, high-impact machine learning systems in a production environment (8+ years)
  • Technical leadership, management, or mentorship roles (5+ years)
  • Setting the long-term technical direction for an entire ML domain or pillar
  • Taking ambiguous problems from concept to scaled production impact
  • Deep expertise in advanced ML architectures, including Large Language Models (LLMs), transfer learning, or other foundation model technologies, especially applied to text or multimodal data
  • Architecting scalable, real-time ML inference services and robust data pipelines operating at millions of requests per second
  • Leading and resolving high-stakes, cross-functional conflicts and technical disagreements
  • Building consensus among diverse stakeholders
  • Exceptional product sense and strategic planning ability
  • Translating platform safety requirements into an achievable, iterative technical roadmap

Nice to have

  • Experience managing Engineering Managers or Senior/Principal-level individual contributors
  • Experience with multimodal Generative AI
  • Experience with high-velocity text filtering
  • Experience with self-supervised learning

What the JD emphasized

  • multi-year technical vision
  • architectural strategy
  • execution roadmap
  • technical planning
  • high-priority new ML projects
  • standards of innovation
  • data quality
  • model robustness
  • ethical deployment
  • architectural development
  • massive-scale systems
  • proactively protecting our community
  • user freedom
  • platform civility
  • highest technical authority
  • long-term maintainability
  • foundational models
  • real-time inference services
  • organization-wide ML infrastructure
  • platform development
  • advanced NLP
  • self-supervised learning
  • multimodal LLMs
  • novel abuse vectors
  • proactive intervention
  • executive-level
  • strategic machine learning roadmap
  • user trust and safety outcomes
  • large-scale, high-impact machine learning systems
  • production environment
  • technical leadership
  • long-term technical direction
  • ambiguous problems
  • scaled production impact
  • Large Language Models (LLMs)
  • foundation model technologies
  • multimodal data
  • scalable, real-time ML inference services
  • robust data pipelines
  • millions of requests per second
  • high-stakes, cross-functional conflicts
  • technical disagreements
  • diverse stakeholders
  • product sense
  • strategic planning ability
  • platform safety requirements
  • iterative technical roadmap
  • complex business and safety goals
  • actionable technical strategy
  • undefined or open-ended problem spaces
  • decisive direction
  • complex technical concepts
  • non-technical executive leadership

Other signals

  • proactive moderation
  • cutting-edge ML solutions
  • massive scale
  • high-severity harms
  • real-time inference