Senior Machine Learning Engineer (Data & Audience Platform Team), Hyderabad

Warner Bros Discovery · Media · Hyderabad, Telangāna, India · Technology

The Senior Machine Learning Engineer on the Data & Audience Platform team is responsible for designing and delivering production ML systems for audience targeting, advertising revenue, subscriber engagement, and retention. The role covers end-to-end development, from data sourcing through feature engineering, model training, evaluation, and deployment to monitoring. Key responsibilities include owning ML products such as identity resolution and affinity models, designing scalable feature pipelines, architecting inference pipelines, developing a range of model types, and championing MLOps best practices. The role also involves using AI-assisted development tools and platforms such as Databricks Genie and Snowflake Cortex.

What you'd actually do

  1. Lead end-to-end development of production ML systems: data sourcing, feature engineering, model training, evaluation, deployment, and monitoring.
  2. Own key ML products such as probabilistic identity resolution (matching unauthenticated device IDs and 1P cookies to households/persons with calibrated confidence), single-title affinity (e.g., STAT two-tower retrieval), and audience/propensity models.
  3. Design scalable feature pipelines on Databricks (PySpark, Delta, Workflows/DLT, Unity Catalog) and the WBD feature store, with documented feature contracts, backfill paths, and freshness SLAs.
  4. Develop and optimize models across the ML spectrum: gradient boosting (XGBoost/LightGBM), embedding/two-tower retrieval, neural ranking, probability calibration (e.g., isotonic regression), and probabilistic/graph-based matching.
  5. Champion MLOps best practices: model versioning, champion/challenger promotion, automated retraining triggers, drift detection, and production monitoring with MLflow on Databricks.

Skills

Required

  • Databricks
  • PySpark
  • Delta
  • Workflows/DLT
  • Unity Catalog
  • Snowflake
  • AWS
  • SageMaker
  • XGBoost
  • LightGBM
  • MLflow
  • gradient boosting
  • embedding/two-tower retrieval
  • neural ranking
  • probability calibration
  • probabilistic/graph-based matching
  • feature engineering
  • model training
  • model evaluation
  • model deployment
  • model monitoring
  • MLOps best practices
  • data sourcing
  • inference pipelines
  • batch processing
  • near-real-time processing
  • feature store strategy
  • data quality checks
  • model health dashboards
  • alerting thresholds
  • FinOps cost discipline
  • AI-assisted development
  • Databricks Genie
  • Snowflake Cortex
  • RAG

Nice to have

  • causal-inference techniques
  • uplift/incrementality modeling
  • Data Clean Rooms
  • agentic AI development workflows

What the JD emphasized

  • production ML systems
  • audience targeting
  • personalization
  • feature stores
  • training and serving pipelines
  • MLOps
