Senior Data Engineer

ZoomInfo · Enterprise · Toronto, ON · 908 Product Management - Analytics

We are looking for a Senior Data Engineer to design and expand enterprise-level data infrastructure that enables internal teams to interact with data comprehensively. The role involves integrating diverse data sources into AI applications, including LLM-powered systems, and implementing architectures for Retrieval Augmented Generation (RAG) and advanced search.

What you'd actually do

  1. Design, develop, and maintain high-performance, product-centric data pipelines using Airflow, dbt, and Python.
  2. Architect and optimize the massive-scale data warehouse and lakehouse that serves as our single source of truth for all customer data, primarily using Snowflake.
  3. Lead the integration of diverse structured and unstructured data sources (e.g., web data, third-party APIs) into our data ecosystem, ensuring high-quality and reliable ingestion.
  4. Implement and enforce Model Context Protocol (MCP) or similar architectures to feed accurate and contextual data into our LLM-powered products for applications like Retrieval Augmented Generation (RAG) and advanced search.
  5. Define, monitor, and enforce data quality SLAs across all pipelines and products, ensuring data accuracy and lineage are a top priority.

Skills

Required

  • Expert-level SQL
  • Strong Python programming skills
  • Production-level experience with large-scale batch and streaming data processing
  • dbt (data build tool)
  • Snowflake data warehouse design, optimization, and cost modeling
  • Model Context Protocol (MCP) or similar architectures
  • Data lakes, event-driven architectures (e.g., Kafka), ETL/ELT, and data mesh
  • Cloud platforms (GCP and/or AWS)
  • Infrastructure as code (e.g., Terraform)
  • Excellent communication skills
  • Strategic and product-oriented thinking
  • Leadership and mentorship
  • Stakeholder management
  • Agility and adaptability
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
  • 8+ years of progressive experience in data engineering

Nice to have

  • LLMOps
  • LangChain
  • RAG (Retrieval Augmented Generation) pipelines
  • Building embedding models or pipelines for Named Entity Recognition (NER)
  • Data cataloging tools (e.g., OpenLineage) and lineage tracking
  • Other distributed processing systems and databases (e.g., Flink, DynamoDB)

What the JD emphasized

  • product-centric data pipelines
  • massive-scale data warehouse and lakehouse
  • diverse structured and unstructured data sources
  • LLM-powered products
  • Retrieval Augmented Generation (RAG)
  • data quality SLAs
  • data-centric product company

Other signals

  • Designing and expanding enterprise-level data infrastructure that enables ZoomInfo's internal teams to interact with data comprehensively
  • Integrating vast, diverse data sources into our AI applications, including our industry-leading LLM-powered systems
  • Implement and enforce Model Context Protocol (MCP) or similar architectures to feed accurate and contextual data into our LLM-powered products for applications like Retrieval Augmented Generation (RAG) and advanced search