Research Scientist, Mechanistic Interpretability, Special Projects

Google · Mountain View, CA (+1 location)

Research Scientist focused on mechanistic interpretability of large language models, with the goal of making them safe, aligned, and reliable. The role involves exploring emerging interpretability methods, developing open-source infrastructure, performing causal validation of discovered mechanisms, publishing findings, and writing code to run experiments on distributed compute clusters.

What you'd actually do

  1. Lead and co-lead research projects exploring emerging mechanistic interpretability methods, including dictionary-learning architectures (e.g., multi-token transcoders, Matryoshka sparse autoencoders), Patchscopes, and agentic interpretability (see the sparse-autoencoder sketch after this list).
  2. Design, develop, and maintain open-source infrastructure and evaluation suites (similar to SAEBench or the dictionary_learning library) to accelerate community and internal research.
  3. Perform causal validation of discovered features and circuits using activation patching and feature steering to mitigate undesired behaviors such as hallucinations or hidden objectives (see the patching sketch after this list).
  4. Write and present papers at machine learning conferences (e.g., NeurIPS, ICML) and author technical blog posts to communicate concepts to the broader AI safety community.
  5. Act as both a scientist and an engineer, writing code to run experiments on distributed compute clusters.
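
For item 1, the simplest dictionary-learning architecture is a one-layer sparse autoencoder trained to reconstruct cached model activations through an overcomplete ReLU bottleneck. A minimal sketch, assuming PyTorch; the dimensions, the L1 coefficient, and the single-layer design are illustrative placeholders, not a description of the team's actual tooling:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """One-layer SAE: encode activations into an overcomplete sparse code."""
    def __init__(self, d_model=768, d_dict=16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model)

    def forward(self, x):
        feats = torch.relu(self.encoder(x))  # sparse feature activations
        return self.decoder(feats), feats

sae = SparseAutoencoder()
acts = torch.randn(64, 768)  # stand-in for cached residual-stream activations
recon, feats = sae(acts)
# Reconstruction error plus an L1 penalty that pushes features toward sparsity.
loss = ((recon - acts) ** 2).mean() + 3e-4 * feats.abs().sum(-1).mean()
loss.backward()
```

For item 3, activation patching splices an activation from a "clean" forward pass into a "corrupted" one and measures how the output shifts. The sketch below assumes a Hugging Face GPT-2 and the name-swap prompt pair common in the interpretability literature; the model, layer index, and causal-effect metric are placeholder choices:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

LAYER = 6  # which block to patch; in practice you sweep over layers

clean = tokenizer("When John and Mary went to the store, John gave the bag to",
                  return_tensors="pt")
corrupt = tokenizer("When John and Mary went to the store, Mary gave the bag to",
                    return_tensors="pt")
assert clean["input_ids"].shape == corrupt["input_ids"].shape

# The single position where the two prompts diverge (the second name token).
pos = (clean["input_ids"] != corrupt["input_ids"]).nonzero()[0, 1].item()

# 1. Run the clean prompt and cache the residual stream at the chosen block.
cache = {}
def save_hook(module, args, output):
    cache["resid"] = output[0].detach()  # GPT2Block returns (hidden_states, ...)

handle = model.transformer.h[LAYER].register_forward_hook(save_hook)
with torch.no_grad():
    model(**clean)
handle.remove()

# 2. Rerun the corrupted prompt, splicing the clean activation in at `pos`.
def patch_hook(module, args, output):
    hidden = output[0].clone()
    hidden[:, pos, :] = cache["resid"][:, pos, :]
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(patch_hook)
with torch.no_grad():
    logits = model(**corrupt).logits
handle.remove()

# 3. Did the patch move the corrupted run back toward the clean answer?
mary = tokenizer.encode(" Mary")[0]
john = tokenizer.encode(" John")[0]
print("logit(Mary) - logit(John):", (logits[0, -1, mary] - logits[0, -1, john]).item())
```

Sweeping `LAYER` and `pos` while recording the logit difference localizes which activations carry the behavior, which is the causal-validation step the item describes.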

Skills

Required

  • PhD in Computer Science, a related field, or equivalent practical experience.
  • Experience building machine learning solutions using a range of architectures (e.g., deep networks, LSTMs, convolutional networks) and open-source frameworks (e.g., TensorFlow, PyTorch).
  • Experience in Python programming.
  • One or more scientific publications submitted to conferences, journals, or public repositories (e.g., CVPR, ICCV, NeurIPS, ICML, ICLR).

Nice to have

  • 2 years of coding experience.
  • 1 year of experience managing and initiating research agendas.
  • Experience designing multi-modal, self-supervised pre-training tasks (e.g., contrastive learning, masked autoencoders) to improve data efficiency and handle sparse signals (a minimal contrastive-loss sketch follows this list).
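
A common concrete instance of the contrastive objective mentioned above is the InfoNCE / NT-Xent loss, in which each example's two augmented views are positives and the rest of the batch serves as negatives. A minimal sketch, assuming paired embeddings already produced by some encoder; the batch size, dimension, and temperature are placeholders:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE / NT-Xent loss over a batch of paired views.

    z1, z2: [batch, dim] embeddings of two augmentations of the same inputs.
    Row i of z1 and row i of z2 form a positive pair; every other row in
    the batch acts as a negative.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature  # [batch, batch] cosine similarities
    targets = torch.arange(z1.size(0))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random embeddings standing in for an encoder's outputs.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(f"InfoNCE loss: {info_nce(z1, z2).item():.3f}")
```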

What the JD emphasized

  • mechanistic interpretability
  • safety
  • alignment
  • reliability
  • scientific publication submissions for conferences, journals, or public repositories

Other signals

  • reverse-engineer
  • compositional and structural mechanisms