[2026] Senior Machine Learning Engineer, Multimodal AI, Computer Vision and Graphics - PhD Early Career

Roblox · Consumer · San Mateo, CA · Early Career Full-Time

Roblox is seeking a Senior Machine Learning Engineer with a PhD to work on multimodal AI, focusing on computer vision and graphics. The role involves building and deploying foundation models for content understanding, creation, search, recommendations, and safety systems. This includes developing models for visual and 3D content, facial age estimation, and fraud detection, with a strong emphasis on applied research and production impact.

What you'd actually do

  1. Design and implement foundation models for visual and 3D-based creation, search, and recommendations, ensuring high fidelity, relevance, and ranking quality.
  2. Break down complex product requirements into iterative, deliverable stages, moving applied research into high-scale production systems.
  3. Implement innovative visual and multimodal models that power core Roblox functions (e.g., world creation, avatar systems, search, and recommendations).
  4. Build high-precision facial age estimation across demographics from the ground up, including various fraud detection techniques, for a robust and safe user identity system.

Skills

Required

  • PhD in computer science, engineering, or a related field
  • Expertise in computer vision, multimodal learning, 3D graphics, or large-scale representation learning
  • Experience developing and training deep learning models using modern frameworks (PyTorch, TensorFlow, JAX)
  • Proficiency in Python, C++, Go, or Java
  • Experience building and optimizing large-scale systems

Nice to have

  • multimodal AI
  • computer vision
  • graphics
  • generative modeling

What the JD emphasized

  • PhD
  • thesis aligned to Roblox’s research areas
  • strong research track record
  • multiple publications and presentations in top-tier, peer-reviewed venues

Other signals

  • building cutting-edge models
  • applied research and engineering projects with direct production impact
  • high-scale production systems
  • multimodal models
  • computer vision
  • 3D avatars