[2026] Senior Machine Learning Engineer (Systems), Embodied AI/NPCs, ML Platform - PhD Early Career

Roblox · Consumer · San Mateo, CA · Early Career Full-Time

Roblox is seeking a Senior Machine Learning Engineer for its Embodied AI/NPCs and ML Platform teams. The role involves developing data pipelines at scale, training novel deep learning architectures for NPCs, and optimizing real-time inference for autonomous NPCs. It also covers pioneering next-generation AI tooling, building core platform components (Serving Layer, Model Registry, Pipeline Orchestrator), and designing developer experiences for ML@Roblox. On the inference side, the position includes architecting and implementing scalable distributed inference systems for LLMs and large recommender models, optimizing inference engines for massive scale and low latency, and conducting low-level performance analysis on GPU architectures. The ideal candidate has a PhD and experience with end-to-end ML pipelines, real-world agentic applications, and scaling high-performance architectures.

What you'd actually do

  1. Develop Data Pipelines at Scale: Design, build, and maintain robust data pipelines to collect complex 3D game states and real-time player actions across the platform.
  2. Train Novel Architectures: Solve cross-game feature extraction for the NPC model in a general, scalable way, and improve training speed for novel, sophisticated deep learning architectures.
  3. Optimize Real-Time Inference: Engineer high-performance model inference solutions to support the seamless deployment of tens to hundreds of autonomous NPCs in real-time environments.
  4. Pioneer next-generation AI tooling to enhance the efficiency, cost, and usability of ML@Roblox.
  5. Build and maintain core platform components: Serving Layer, Model Registry, Pipeline Orchestrator, and Training/Inference control planes.
  6. Design great developer experiences (paved-road templates, tooling, visualizations) to reduce time-to-production and ensure foundational AI systems are scalable and reliable.
  7. Architect and implement scalable distributed inference systems for efficiently serving LLMs and Large Recommender Models at massive scale.
  8. Optimize our inference engine to serve millions of QPS at low latency.
  9. Conduct deep, low-level performance analysis and optimize ML models (using techniques like continuous batching, speculative decoding, and quantization) and systems on GPU architectures to maintain peak performance and stability.

Skills

Required

  • Ph.D. in Computer Science, Computer Engineering, Mathematics, Statistics, or a related technical field
  • Built end-to-end ML pipelines
  • Managed model inference and deployment
  • Experience with novel datasets
  • Building real-world agentic applications
  • Scaled high-performance, high-availability architectures
  • Infrastructure using Kubernetes
  • Major cloud providers (AWS, Azure, or GCP)

Nice to have

  • Thesis aligned to Roblox’s research areas
  • Continuous batching
  • Speculative decoding
  • Quantization

What the JD emphasized

  • Ph.D. in Computer Science, Computer Engineering, Mathematics, Statistics, or a related technical field, with a thesis aligned to Roblox’s research areas
  • Built end-to-end ML pipelines and managed model inference and deployment
  • Experience with novel datasets, and building real-world agentic applications
  • Scaled high-performance, high-availability architectures
  • real-time inference
  • massive scale
  • low latency
  • GPU architectures

Other signals

  • building cutting-edge systems that power AI
  • NPC system that can play any Roblox game
  • real-time inference efficiently enough to support deployment to all Roblox players
  • 3D foundational models
  • democratizing creation by making it simple for anyone to generate high-quality, immersive 3D experiences using AI
  • supporting hundreds of ML use cases and billions of inferences daily
  • AI Platform, Distributed Inference Systems