Data Engineer II - GenAI

Booking.com · Hospitality · Tel Aviv, Israel · Data Engineering

Data Engineer II - GenAI role at Booking.com focused on building and optimizing data pipelines for training content models, including GenAI foundation models and supervised fine-tuning. The role involves working with petabyte-scale datasets from various sources to ensure high-quality data for ML platforms and downstream applications, in collaboration with data scientists and engineers.

What you'd actually do

  1. Rapidly developing next-generation scalable, flexible, and high-performance data pipelines.
  2. Dealing with massive textual sources to train GenAI foundation models.
  3. Solving issues with data and data pipelines, prioritizing based on customer impact.
  4. Owning data quality end-to-end across our core datasets and data pipelines.
  5. Experimenting with new tools and technologies to meet business requirements regarding performance, scaling, and data quality.

Skills

Required

  • Python
  • Java
  • PySpark
  • Apache Flink
  • Snowflake
  • MySQL
  • Cassandra
  • DynamoDB
  • Data warehousing
  • ETL/ELT pipelines
  • Production data pipelines
  • Data lakes
  • Serverless solutions
  • Schema design
  • Data modeling

Nice to have

  • Experience in data processing for large-scale language models such as GPT, BERT, or similar architectures
  • NumPy
  • pandas
  • Matplotlib
  • Experimental design
  • A/B testing
  • Evaluation metrics for ML models
  • Working on products that impact a large customer base

What the JD emphasized

  • Minimum of 3 years of experience as a Data Engineer or in a similar role, with a consistent record of successfully delivering ML/data solutions.
  • You have built production data pipelines in the cloud, setting up data lakes and serverless solutions; you have hands-on experience with schema design and data modeling, and with working alongside ML scientists and ML engineers to deliver production-level ML solutions.

Other signals

  • data pipelines
  • GenAI foundation models
  • ML platforms
  • petabytes of data
  • ML scientists