Software Engineer, AI I18n and Evaluations

Google · Big Tech · Singapore

Software Engineer focused on AI internationalization (i18n) and evaluations for Pixel and Android. The role leads R&D for AI feature expansion, quality evaluations, and rater quality using on-device and server-based models; day-to-day work includes creating auto-raters, ensuring metric consistency, establishing benchmarks, and collaborating with AI feature teams. It also involves identifying opportunities and leading roadmaps to scale language capabilities and improve model evaluation processes.
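To make the auto-rater responsibility concrete, below is a minimal sketch of an LLM-as-judge rater. Everything in it is a hypothetical stand-in for illustration: `call_judge_model`, the rubric text, and `RatingResult` are assumptions made up for this example, not Google's actual tooling.

```python
# Minimal sketch of an "auto-rater": an LLM-as-judge that scores a model
# response against a rubric. call_judge_model is a hypothetical stand-in
# for whatever on-device or server model endpoint a real team would use.
from dataclasses import dataclass


@dataclass
class RatingResult:
    example_id: str
    score: int          # 1-5 rubric score
    rationale: str


RUBRIC = (
    "Rate the response from 1 (unsafe/wrong) to 5 (safe, fluent, and "
    "correct for the target locale). Answer as '<score>: <rationale>'."
)


def call_judge_model(prompt: str) -> str:
    # Hypothetical model call; replace with a real client. Stubbed so the
    # sketch runs end to end.
    return "4: fluent and locale-appropriate, minor formality mismatch"


def auto_rate(example_id: str, user_prompt: str, response: str, locale: str) -> RatingResult:
    # Assemble the judge prompt, call the judge, and parse its
    # '<score>: <rationale>' reply into a structured result.
    judge_prompt = (
        f"{RUBRIC}\n\nLocale: {locale}\nUser prompt: {user_prompt}\n"
        f"Model response: {response}"
    )
    raw = call_judge_model(judge_prompt)
    score_text, _, rationale = raw.partition(":")
    return RatingResult(example_id, int(score_text.strip()), rationale.strip())


if __name__ == "__main__":
    print(auto_rate("ex-001", "Set an alarm for 7am",
                    "Alarm diatur untuk jam 7 pagi.", "id-ID"))
```

In practice the judge call would hit an on-device or server model, and the auto-rater's scores would be validated against human rater agreement (the "rater quality" part of the role) before being trusted for launch decisions.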

What you'd actually do

  1. Ensure a safe, high-quality experience for our AI products by crafting evaluation datasets, metrics, and pipelines to understand, evaluate, and optimize the behavior of our models, platform, and algorithms across languages, locales, and hardware (a sketch of such a pipeline follows this list).
  2. Identify opportunities, develop strategies, and lead roadmaps to scale country and language capabilities and to improve the reliability, scalability, and efficiency of model evaluation processes.
  3. Collaborate closely with AI feature teams, model developers, and researchers to understand evaluation requirements, provide support, and integrate new models and use cases into evaluation suites and benchmarks.
  4. Grow local team expertise and projects, mentor team members, and remove bottlenecks; improve the team's design, code, and engineering practices.
  5. Interface with cross-functional and remote teams, engineering managers, and stakeholders to influence cross-team priorities and roadmaps.
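As referenced in item 1, here is a small, purely illustrative sketch of the aggregation side of such a pipeline: rolling auto-rater scores up per locale and flagging locales that miss a launch-quality bar. The `LAUNCH_BAR` threshold and `locale_report` helper are assumptions invented for this example, not a description of Google's pipeline.

```python
# Illustrative sketch: aggregate auto-rater scores per locale and flag
# locales below a hypothetical launch-quality bar, the kind of check
# implied by "metric consistency" and "establishing benchmarks".
from collections import defaultdict
from statistics import mean

LAUNCH_BAR = 4.0  # assumed minimum mean rubric score; purely illustrative


def locale_report(ratings: list[tuple[str, int]]) -> dict[str, dict]:
    """ratings: (locale, score) pairs from an auto-rater run."""
    by_locale: dict[str, list[int]] = defaultdict(list)
    for locale, score in ratings:
        by_locale[locale].append(score)
    # Summarize each locale: sample size, mean score, and whether it
    # clears the assumed launch bar.
    return {
        loc: {"n": len(scores), "mean": round(mean(scores), 2),
              "meets_bar": mean(scores) >= LAUNCH_BAR}
        for loc, scores in sorted(by_locale.items())
    }


if __name__ == "__main__":
    sample = [("en-US", 5), ("en-US", 4), ("id-ID", 4), ("id-ID", 3), ("th-TH", 5)]
    for loc, stats in locale_report(sample).items():
        print(loc, stats)
```

A real pipeline would also track per-metric confidence intervals and rater agreement so that scores remain comparable as new models and locales are added, but the shape of the check is the same.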

Skills

Required

  • software development
  • testing and launching software products
  • software design and architecture
  • speech/audio
  • reinforcement learning
  • ML infrastructure
  • ML design
  • model deployment
  • model evaluation
  • data processing
  • debugging
  • fine-tuning

Nice to have

  • on-device machine learning
  • mobile algorithms
  • Large Language Model/GenAI evaluations
  • data collection for ML
  • Android development
  • AI toolchain
  • evaluation metrics design
  • data management techniques
  • launching one or multiple AI/ML-powered user-facing products across countries
  • Android or Pixel development ecosystem

What the JD emphasized

  • quality evaluations
  • rater quality
  • evaluation datasets
  • evaluate and optimize
  • model evaluation processes
  • Large Language Model/GenAI evaluations

Other signals

  • evaluating AI features
  • intelligent applications
  • creating auto-raters
  • ensuring metric consistency
  • establishing benchmarks
  • quality and performance bar for AI launches
  • evaluation datasets, metrics, and pipelines
  • evaluate and optimize the behavior of our models
  • integrate new models and use cases into the evaluation and benchmarks
  • scale country and language capabilities
  • improve the reliability, scalability, and efficiency of model evaluation processes