At Netflix, our mission is to entertain the world. Together, we are writing the next episode - pushing the boundaries of storytelling and global fandom, and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition and cutting-edge technology. Come be a part of what’s next.

We launched a new ad-supported tier in November 2022 and are building an in-house world-class ad tech ecosystem to offer our members more choices in consuming their content. Our new tier allows us to attract new members at a lower price point while also creating a compelling path for advertisers to reach deeply engaged audiences.

Our Team

The Decisioning & Optimization engineering team owns the systems that determine which ad wins every impression, at what price, and how campaign budgets deliver across all inventory surfaces. Our work spans three platform areas:

ML infrastructure for model serving: real-time inference at 1M+ QPS, multi-model parallel evaluation, feature hydration, and the model lifecycle from canary deployment through production monitoring
Auction, ranking, and scoring: multi-stage candidate selection, scoring, bid valuation, dynamic pricing, and podding
Budget, pacing, and bidding: control systems for delivery optimization, budget planning, and bid computation

We are scaling from a handful of production models to 10+ while maintaining sub-20ms P99 inference budgets. We are looking for an ML engineer who can build and operate the serving infrastructure these models run on, and who understands the ads decisioning context well enough to make the right engineering tradeoffs.

What You'll Do

Build and operate end-to-end ML model serving infrastructure for real-time ad decisioning: model publishing, packaging, validation, and deployment into the serving stack with zero-downtime hot-swap
Scale the inference path to support dozens of concurrent models on every ad request at 1M+ QPS with strict latency budgets, including batching strategies, CPU/GPU allocation, model versioning, and fallback tiers
Design and optimize the feature serving path: feature hydration from Chronon, Signal Service, and real-time streams with sub-10ms P99 fetch latency and online/offline consistency
Productionize scoring and ranking models for multi-stage ad selection (retrieval, early ranking, full scoring) and integrate model outputs into the auction
Build model performance monitoring in production: inference latency, prediction distribution shifts, feature drift detection, score calibration, and regression detection before revenue impact
Partner closely with Data Science & Platform teams
Build simulation infrastructure to replay production traffic against candidate models offline, enabling validation of marketplace changes before live rollout
Drive operational excellence for ML systems: reliability, observability, capacity planning, incident response, and scaling for live events with 35M+ concurrent viewers

Skills & Experience We're Seeking

7+ years of software engineering experience; 3+ years focused on ML infrastructure, model serving, or ML platform work in an ads or real-time decisioning context
Built and operated real-time model serving systems at high QPS with sub-20ms latency: online inference, feature stores, model registries, model hot-swap, canary and shadow rollout
Proficiency in Java, Python, or Scala with a solid understanding of multi-threading, memory management, and performance optimization for latency-critical paths
Hands-on with ML serving frameworks: serialization, runtime optimization, and deployment constraints
Experience with feature engineering pipelines for real-time systems: online/offline consistency, hydration strategies, caching, and freshness tradeoffs
Strong understanding of model monitoring in production: drift detection, prediction distribution analysis, calibration, and latency profiling
Comfortable working at the boundary between ML research and production engineering: can take a model artifact and turn it into a production-ready service that meets its SLA
Demonstrated ability to operate in an environment that requires both big-tech scale and startup speed

Nice to Haves

Ads domain experience: ranking models, bid scoring, reserve pricing, yield optimization, dynamic allocation across guaranteed and non-guaranteed inventory
Experience with auction mechanics: multi-stage ranking, bid shading, bid prediction, marketplace competition dynamics
Built or improved budget pacing and delivery control systems
Built simulation or counterfactual testing platforms for marketplace or auction systems
Experience with A/B testing infrastructure for model rollouts: online experiments, holdout groups, interference-aware evaluation in marketplace settings
Familiar with CTV constraints: server-side ad insertion, live event ad serving at scale, burst traffic patterns
Experience with the JVM ecosystem

Generally, our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top of market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience to determine your compensation in the market range. The range for this role is $466,000.00 - $750,000.00.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days annually for paid time off to be used for vacation, holidays, and sick paid time off. Full-time salaried employees are immediately entitled to flexible time off. See more details about our Benefits here.

Netflix is a unique culture and environment. Learn more here.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Machine Learning Engineer 5 - Decisioning & Optimization
