Senior Research Engineer, On-device Inference, Robotics, DeepMind

Google · Big Tech · Mountain View, CA +1

Senior Research Engineer focused on optimizing Gemini Robotics models for low-latency on-device inference, driving alignment between model and hardware architectures, and influencing future model designs for resource-constrained environments.

What you'd actually do

  1. Optimize model performance for on-device use cases (memory-, power-, and compute-constrained environments).
  2. Influence future Gemini model architectures to match unique robotics use cases.
  3. Optimize agent and system-level performance (e.g., orchestration of multiple models).
  4. Drive strong alignment between model architectures and hardware architectures.
  5. Engage directly with research, software engineering, and hardware engineering teams to deliver end-to-end solutions.

Skills

Required

  • optimizing machine learning models for resource-constrained environments
  • inference for Large Language Models (LLMs)
  • Python
  • C++

Nice to have

  • core software engineering
  • building highly available systems
  • ML frameworks such as JAX, TensorFlow, or PyTorch
  • high-performance inference
  • aligning model architectures with AI accelerators
  • distillation
  • articulating complex technical requirements and performance tradeoffs
  • communication and collaboration skills
  • driving and influencing cross-functional teams

What the JD emphasized

  • 8 years of experience in optimizing machine learning models for resource-constrained environments.
  • low-latency on-device applications
  • low-latency inference techniques

Other signals

  • on-device inference
  • model optimization
  • robotics use cases
  • LLM inference