Senior Data Engineer, GTM

Google · Big Tech · Mountain View, CA +3

Senior Data Engineer role focused on building data pipelines and infrastructure to process unstructured customer feedback data, integrating AI agents and LLMs for insights and routing. The role involves NLP, embedding workflows, and MLOps/LLMOps principles.

What you'd actually do

Design and maintain pipelines to ingest, clean, and process massive volumes of unstructured data, including business transcripts and support cases, into reliable analytical datasets.
Architect and deploy advanced platforms and tooling that empower the team to leverage autonomous AI agents and Large Language Models (LLMs) for intelligent routing and automated insights.
Develop internal libraries and self-serve frameworks that streamline Natural Language Processing (NLP) and causal analysis, significantly reducing operational friction and enhancing team productivity.
Manage and optimize embedding workflows using TensorFlow and Tensor Processing Units (TPUs), ensuring efficient processing that bypasses standard API constraints for high-volume data.
Implement automated monitoring, alerting, and rigorous data quality checks to guarantee the security, reliability, and governance of high-stakes analytical assets.

Skills

Required

Python
SQL
MLOps
LLMOps
data infrastructure
text processing pipelines
embedding pipelines
data pipelines
data schemas
unstructured text data
machine learning workflows

Nice to have

data schemas
Google Colaboratory (Colab)
TensorFlow
Tensor Processing Units (TPUs)
agentic tools and platforms
LLM orchestration
agentic infrastructure
Python
SQL
MLOps
LLMOps

What the JD emphasized

massive volumes of unstructured conversational data
transforming massive volumes of unstructured conversational data
customer feedback
customer feedback and product decisions
accelerating the Ads product adoption flywheel
shaping Go-to-Market (GTM) strategy
own and architect the foundational infrastructure
transforms unstructured customer feedback
quantified strategic assets
scalable, automated pipelines
integrate sales transcripts
Business Intelligence (BI)
pioneer our transition
flexible workflows
core infrastructure and platforms
multiply our data science team's capacity, agility, and impact
end-to-end delivery
production-ready solutions
ingest, clean, and process massive volumes of unstructured data
business transcripts and support cases
reliable analytical datasets
Architect and deploy advanced platforms and tooling
leverage autonomous AI agents and Large Language Models (LLMs)
intelligent routing and automated insights
Develop internal libraries and self-serve frameworks
streamline Natural Language Processing (NLP) and causal analysis
significantly reducing operational friction
enhancing team productivity
Manage and optimize embedding workflows
TensorFlow and Tensor Processing Units (TPUs)
efficient processing
bypasses standard API constraints
high-volume data
Implement automated monitoring, alerting, and rigorous data quality checks
guarantee the security, reliability, and governance
high-stakes analytical assets
5 years of experience coding in Python and SQL
5 years of experience working with machine learning operations (MLOps) and large language model operations (LLMOps) principles and data infrastructure
deploying text processing and embedding pipelines
5 years of experience designing and deploying data pipelines
managing data schemas
processing unstructured text data
machine learning (ML) workflows
Experience with LLM orchestration and agentic infrastructure

Other signals

transforming massive volumes of unstructured conversational data into quantified, trusted insights
architect the foundational infrastructure that transforms unstructured customer feedback into quantified strategic assets
scalable, automated pipelines that integrate sales transcripts with critical Business Intelligence (BI)
pioneer our transition towards more flexible workflows, developing the core infrastructure and platforms that multiply our data science team's capacity, agility, and impact through end-to-end delivery of production-ready solutions
leverage autonomous AI agents and Large Language Models (LLMs) for intelligent routing and automated insights
streamline Natural Language Processing (NLP) and causal analysis
Manage and optimize embedding workflows using TensorFlow and Tensor Processing Units (TPUs)
Implement automated monitoring, alerting, and rigorous data quality checks

Full job description

gTech’s Product and Tools Operations team (gPTO) leverages deep user, operational, and technical insights to innovate Google's Ads products into customer experiences that are so intuitive (or automated) that they require no support at all. gPTO partners closely with gTech’s Support, Professional Services, Product Management, and Engineering teams to innovate and simplify our Ads products and build the productivity tools ecosystem for gTech users.

As part of Go-to-Market (GTM), you will serve as the intelligence partner for product teams, transforming massive volumes of unstructured conversational data into quantified, trusted insights that bridge the gap between customer feedback and product decisions. This is a high-visibility initiative critical for accelerating the Ads product adoption flywheel and shaping GTM strategy for priority products.

As a Senior Data Engineer, you will own and architect the foundational infrastructure that transforms unstructured customer feedback into quantified strategic assets. You will help us move towards scalable, automated pipelines that integrate sales transcripts with critical Business Intelligence (BI). You will pioneer our transition towards more flexible workflows, developing the core infrastructure and platforms that multiply our data science team's capacity, agility, and impact through end-to-end delivery of production-ready solutions.

Individual pay is determined by factors including job-related skills, experience, and relevant education or training.

US: $156,000 - $227,000 (USD) + 15% bonus target + bonus + equity + benefits

Learn more about benefits at Google.

Responsibilities

Design and maintain pipelines to ingest, clean, and process massive volumes of unstructured data, including business transcripts and support cases, into reliable analytical datasets.
Architect and deploy advanced platforms and tooling that empower the team to leverage autonomous AI agents and Large Language Models (LLMs) for intelligent routing and automated insights.
Develop internal libraries and self-serve frameworks that streamline Natural Language Processing (NLP) and causal analysis, significantly reducing operational friction and enhancing team productivity.
Manage and optimize embedding workflows using TensorFlow and Tensor Processing Units (TPUs), ensuring efficient processing that bypasses standard API constraints for high-volume data.
Implement automated monitoring, alerting, and rigorous data quality checks to guarantee the security, reliability, and governance of high-stakes analytical assets.

Qualifications

Minimum qualifications:

Bachelor's degree or equivalent practical experience.
5 years of experience coding in Python and SQL.
5 years of experience working with machine learning operations (MLOps) and large language model operations (LLMOps) principles and data infrastructure, including deploying text processing and embedding pipelines.
5 years of experience designing and deploying data pipelines, including managing data schemas and processing unstructured text data for machine learning (ML) workflows.

Preferred qualifications:

Experience with data schemas.
Experience with Google Colaboratory (Colab), TensorFlow, Tensor Processing Units (TPUs), and agentic tools and platforms for processing unstructured text data.
Experience with LLM orchestration and agentic infrastructure.
Proficiency in SQL and Python.
Understanding of MLOps/LLMOps principles to ensure the scalable and reliable deployment of text processing and embedding pipelines.