Data Engineer II

Expedia · Hospitality · Gurgaon, India

Data Engineer II role at Expedia Group focused on building and maintaining data products and pipelines for pricing, marketing, and ML use cases. The role involves designing scalable data models, ETL/ELT processes, and APIs, collaborating with data scientists and product teams, and optimizing data workflows. It also requires integrating and operating AI/ML-enabled solutions, which calls for familiarity with AI tools and concepts.

What you'd actually do

  1. Design, build, and maintain reliable, high-quality data pipelines and ETL/ELT processes that enable analytics, reporting, and data-driven product features across multiple domains.
  2. Implement scalable data models, storage patterns, and APIs that support batch and streaming workloads while ensuring data quality, accuracy, and consistency.
  3. Collaborate with software engineers, data scientists, and product teams to understand data needs, translate requirements into technical solutions, and deliver well-documented, reusable datasets.
  4. Develop, optimize, and monitor data workflows and jobs for performance, cost efficiency, and operational robustness, including alerting, logging, and failure recovery.
  5. Apply and extend standard data engineering practices for security, privacy, governance, and compliance, including metadata management, lineage, and access control.
  6. Safely integrate and operate AI/ML-enabled solutions that improve outcomes. This calls for familiarity with AI-driven systems, tools, or workflows and the ability to apply AI/ML concepts to real-world products.

Skills

Required

  • Scala
  • Java
  • Python
  • Spark
  • Airflow
  • AWS Data Stack
  • SQL
  • production-grade data pipelines
  • data models
  • data services
  • monitoring
  • troubleshooting
  • improving reliability and performance

Nice to have

  • designing and optimizing large-scale data pipelines
  • data modelling
  • partitioning
  • performance tuning
  • high-volume datasets
  • distributed data processing
  • storage
  • streaming technologies
  • real-time and batch data products
  • improving data quality
  • observability
  • governance through automation
  • standards
  • tooling
  • Claude
  • Cursor

What the JD emphasized

  • AI/ML-enabled solutions
  • AI-driven systems, tools, or workflows
  • applying AI/ML concepts to real-world products