Research Scientist, ML Efficiency, Google Research

Google · Big Tech · Singapore

Research Scientist focused on improving the computational efficiency of large-scale Generative AI Models (LLMs, Diffusion Models, Generative Videos) through algorithmic research, model compression, and inference acceleration. The role involves advancing algorithms for serving and inference, innovating training architectures, optimizing deployment pipelines, and collaborating with hardware/software teams. A PhD and publication record are required.

What you'd actually do

  1. Advance algorithms, sampling techniques, and large-scale optimization to make serving and inference of generative AI models more efficient and flexible. This includes model compression, knowledge distillation, and quantization strategies.
  2. Innovate algorithms and large language model architectures that improve the computational efficiency and generalization of deep learning model training.
  3. Improve the end-to-end model deployment pipeline, including entirely new formulations of pretraining, instruction tuning, reinforcement learning, and thinking and reasoning.
  4. Collaborate with hardware and software teams to optimize kernels and inference engines, across different hardware and model architectures.
  5. Optimize latency, memory bandwidth, and workloads.
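To give a concrete flavor of the quantization strategies named in item 1, here is a minimal sketch of symmetric per-tensor int8 post-training quantization. The function names and the NumPy-based setup are illustrative assumptions, not anything specified by the role:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    # Map the largest absolute weight to the int8 extreme 127.
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from int8 codes."""
    return q.astype(np.float32) * scale

# Illustrative usage: a 4x int8-over-fp32 memory reduction,
# at the cost of a bounded rounding error per element.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
err = float(np.abs(w - dequantize_int8(q, scale)).max())
```

The rounding error of this scheme is bounded by half the scale per element; production serving stacks typically refine it with per-channel scales or calibration data.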

Skills

Required

  • PhD degree in Computer Science, a related field, or equivalent practical experience
  • One or more scientific publication submissions to conferences, journals, or public repositories (such as CVPR, ICCV, NeurIPS, ICML, ICLR, etc.)

Nice to have

  • Experience in university or industry labs, with a primary emphasis on AI research
  • Understanding of transformer architecture internals
  • Ability to drive new research ideas from problem abstraction and solution design through experimentation to productionisation in a rapidly shifting landscape
  • Excellent technical leadership and communication skills for multi-team, cross-functional collaborations
  • Passion for deep/machine learning, computational statistics, and applied mathematics

What the JD emphasized

  • computational efficiency of large-scale generative AI models
  • algorithmic efficiency
  • model compression
  • inference acceleration
  • serving and inference
  • model deployment pipeline
  • pretraining
  • instruction tuning
  • reinforcement learning
  • optimize kernels and inference engines
  • optimization
  • publication submissions for conferences, journals, or public repositories

Other signals

  • computational efficiency
  • generative AI models
  • inference acceleration
  • model compression
  • quantization