What you'd actually do

Profile and optimize the ad serving runtime for latency, throughput, and resource efficiency across the full request lifecycle: targeting evaluation, policy enforcement, ad selection, and response serialization

Identify and eliminate performance bottlenecks across services: CPU hotspots, GC pressure, memory allocation patterns, thread contention, and network overhead

Design and run load tests, squeeze tests, and capacity models to validate system behavior under peak and burst traffic (including Live events at NFL scale)

Establish performance baselines and regression detection: automated benchmarking in CI/CD to catch regressions before they reach production

Instrument comprehensive latency telemetry, tracing, and profiling across the ad request lifecycle to enable data-driven optimization

Skills

Required

building and optimizing distributed systems and backend services at scale
performance engineering
profiling
JVM internals
bottleneck analysis
latency engineering
load tests
squeeze tests
capacity models
high-throughput, latency-sensitive systems
ad servers
SSPs
DSPs
real-time bidding infrastructure
Java
Kotlin
JVM languages
event-driven architectures
Kafka
Flink
stream processing
throughput optimization
consumer lag management
ad serving concepts
targeting
frequency capping
publisher controls
programmatic protocols

Nice to have

CTV constraints
server-side ad insertion
live event ad serving at scale
logging and telemetry frameworks
high-throughput request pipelines
multi-region deployment
active-active architectures
failover
regional performance variance analysis
chaos engineering
SRE practices
error budgets
game days
fault injection
squeeze testing
hardware-aware optimization
automated performance regression detection
CI/CD pipelines

At Netflix, our mission is to entertain the world. Together, we are writing the next episode: pushing the boundaries of storytelling and global fandom, and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition, and cutting-edge technology. Come be a part of what’s next.

We launched a new ad-supported tier in November 2022 and are building an in-house world-class ad tech ecosystem to offer our members more choices in consuming their content. Our new tier allows us to attract new members at a lower price point while also creating a compelling path for advertisers to reach deeply engaged audiences.

Our Team

The Ad Server Platform team sits within the Ad Serving & Decisioning org at Netflix Ads. We build and maintain the robust, scalable, and efficient ad serving infrastructure that forms the backbone of Netflix’s advertising ecosystem. Our mission is to ensure seamless and reliable delivery of ads across all platforms, devices, and viewing contexts.

Our work spans the core services and frameworks that the broader Ads Platform depends on: supply-agnostic controls, policies, frequency caps, floor prices, ad podding rules, competitive separation, rule-based targeting, event processing, and logging frameworks. We are looking for a performance-minded systems engineer to ensure our ad serving infrastructure operates at peak efficiency under demanding latency and throughput requirements, serving teams across the ads ecosystem.

What You'll Do

Profile and optimize the ad serving runtime for latency, throughput, and resource efficiency across the full request lifecycle: targeting evaluation, policy enforcement, ad selection, and response serialization
Identify and eliminate performance bottlenecks across services: CPU hotspots, GC pressure, memory allocation patterns, thread contention, and network overhead
Design and run load tests, squeeze tests, and capacity models to validate system behavior under peak and burst traffic (including Live events at NFL scale)
Establish performance baselines and regression detection: automated benchmarking in CI/CD to catch regressions before they reach production
Instrument comprehensive latency telemetry, tracing, and profiling across the ad request lifecycle to enable data-driven optimization
Optimize the rule engine framework and frequency management service for minimal overhead per request as policy complexity grows
Work closely with the programmatic team on bid request/response performance: QPS management, connection pooling, timeout tuning, and load shedding under pressure
Drive platform reliability through an efficiency lens: capacity planning, autoscaling tuning, graceful degradation, and cost-per-request optimization
Own performance SLOs and budgets: define latency budgets per component, track them, and hold teams accountable when budgets are exceeded
Partner with infrastructure and platform teams to adopt runtime improvements, evaluate hardware configurations, and tune GC strategies

Skills & Experience We're Seeking

7+ years building and optimizing distributed systems and backend services at scale
Deep experience with performance engineering: profiling (async-profiler, JFR, flamegraphs), JVM internals (GC tuning, JIT compilation, memory models), and systematic bottleneck analysis
Strong understanding of latency engineering: cache hierarchies, connection pooling, async I/O, thread pool sizing, and tail latency reduction
Experience designing and running load tests, squeeze tests, and capacity models for high-throughput, latency-sensitive systems
Built or operated ad servers, SSPs, DSPs, or real-time bidding infrastructure
Proficiency in Java, Kotlin, or similar JVM languages with deep understanding of runtime behavior beyond just writing correct code
Experience with event-driven architectures: Kafka, Flink, or similar stream processing, with a focus on throughput optimization and consumer lag management
Understanding of ad serving concepts: targeting, frequency capping, publisher controls, programmatic protocols
Ability to operate in an environment that is a mix of big-tech scale and startup speed

Nice to Haves

Experience with CTV constraints: server-side ad insertion, live event ad serving at scale (NFL-sized audiences)
Built or improved logging and telemetry frameworks for high-throughput request pipelines with minimal performance overhead
Multi-region deployment experience: active-active architectures, failover, and regional performance variance analysis
Chaos engineering or SRE practices: error budgets, game days, fault injection, squeeze testing
Experience with hardware-aware optimization
Built automated performance regression detection in CI/CD pipelines

Generally, our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top of market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience to determine your compensation in the market range. The range for this role is $388,000.00 - $619,000.00. This compensation range will vary based on location.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days annually for paid time off to be used for vacation, holidays, and sick paid time off. Full-time salaried employees are immediately entitled to flexible time off. See more details about our Benefits here.

Netflix is a unique culture and environment. Learn more here.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

Job is open for no less than 7 days and will be removed when the position is filled.


Performance Systems Engineer 5 - Ad Server Platform
