Senior Research Engineer, On-device Inference, Robotics, Google DeepMind

Google · Mountain View, CA

Senior Research Engineer focused on optimizing Gemini Robotics models for low-latency on-device inference, driving alignment between model architectures and edge-device constraints, and collaborating with research and engineering teams to deliver robust solutions. Requires deep knowledge of inference techniques across GPU, TPU, and CPU architectures.

What you'd actually do

  1. Optimize model performance for on-device use cases (memory-, power-, and compute-constrained environments).
  2. Influence future Gemini model architectures to match unique robotics use cases.
  3. Optimize agent and system-level performance (e.g., orchestration of multiple models).
  4. Drive strong alignment between model architectures and hardware architectures.
  5. Engage directly with research, software engineering, and hardware engineering teams to deliver end-to-end solutions.

Skills

Required

  • optimizing machine learning models for resource-constrained environments
  • inference for Large Language Models (LLMs)
  • Python
  • C++

Nice to have

  • core software engineering
  • building highly available systems
  • ML frameworks such as JAX, TensorFlow, or PyTorch
  • high-performance inference
  • aligning model architectures with AI accelerators
  • distillation
  • articulating complex technical requirements and performance tradeoffs
  • communication and collaboration skills
  • driving and influencing cross-functional teams

What the JD emphasized

  • 8 years of experience in optimizing machine learning models for resource-constrained environments.
  • low-latency on-device applications
  • low-latency inference techniques

Other signals

  • on-device inference
  • robotics
  • model optimization
  • edge devices
  • LLM inference