Senior Research Engineer, On-device Inference, Robotics, Google DeepMind

Google · Mountain View, CA

Senior Research Engineer focused on optimizing Gemini Robotics models for low-latency on-device inference, driving alignment between model architectures and edge-device constraints, and collaborating with research and engineering teams to deliver robust solutions. Requires deep knowledge of inference techniques across GPU, TPU, and CPU architectures.

What you'd actually do

  1. Optimize model performance for on-device use cases (memory-, power-, and compute-constrained environments).
  2. Influence future Gemini model architectures to match unique robotics use cases.
  3. Optimize agent and system-level performance (e.g., orchestration of multiple models).
  4. Drive strong alignment between model architectures and hardware architectures.
  5. Engage directly with research, software engineering, and hardware engineering teams to deliver end-to-end solutions.

Skills

Required

  • optimizing machine learning models for resource-constrained environments
  • inference for Large Language Models (LLMs)
  • Python
  • C++

Nice to have

  • core software engineering
  • building highly available systems
  • ML frameworks such as JAX, TensorFlow, or PyTorch
  • high-performance inference
  • aligning model architectures with AI accelerators
  • distillation
  • articulating complex technical requirements and performance tradeoffs
  • communication and collaboration skills
  • driving and influencing cross-functional teams

What the JD emphasized

  • 8 years of experience in optimizing machine learning models for resource-constrained environments.
  • low-latency on-device applications
  • low-latency inference techniques

Other signals

  • on-device inference
  • robotics
  • model optimization
  • edge devices
  • LLM inference