What you'd actually do

Lead & Grow: Hire, mentor, and retain a high-performing team of ML engineers / systems-oriented engineers working on model optimization and ML efficiency.

Set Technical Direction: Define the roadmap for training optimization, inference optimization, launch-readiness tooling, and reusable efficiency primitives across Ads ML.

Deliver Measurable Wins: Drive reductions in model training time, online latency, serving cost, and infra-driven launch risk.

Build Systems and Tooling: Guide the development of profiling, benchmarking, load testing, observability, cost analysis, debugging, and efficiency certification systems.

Operate in the Critical Path: Partner with model owners and platform teams to accelerate high-priority launches and remove bottlenecks from the path to production.

Skills

Required

ML engineering
systems optimization
organizational leverage
model optimization
training efficiency
inference optimization
GPU enablement
load testing
model performance tooling
efficiency guardrails
hiring
mentoring
team leadership
roadmap definition
profiling
benchmarking
observability
cost analysis
debugging
distributed systems
production-scale ML systems
reliability
speed
cost
scale
service provider mindset
building reusable systems
technical communication

Nice to have

Ads experience
ads ranking
recommender systems
marketplace ML
GPU training and serving migrations
PyTorch
distributed training frameworks
kernel optimization
performance optimization
efficiency benchmarking frameworks
launch certification frameworks
ML platform
applied modeling

What the JD emphasized

model optimization

training efficiency

inference optimization

efficiency guardrails

model training time

online latency

serving cost

launch risk

profiling

benchmarking

load testing

observability

cost analysis

debugging

efficiency certification systems

optimization

performance debugging

launch safety

Deep ML Engineering Experience

Hands-on Optimization Background

Distributed Systems Fluency

Reddit is a community of communities. It’s built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 126 million daily active unique visitors, Reddit is one of the internet’s largest sources of information. For more information, visit www.redditinc.com.

Reddit has a flexible workforce! If you happen to live close to one of our physical office locations, our doors are open for you to come into the office as often as you'd like. Don't live near one of our offices? No worries: You can apply to work remotely in any country in which we have a physical presence.

About the Role

Reddit is building a dedicated Ads ML Efficiency function to make model training and inference materially faster, cheaper, safer, and more scalable. As the Engineering Manager for this team, you will lead a group focused on model optimization, training efficiency, GPU enablement, load testing, model performance tooling, and efficiency guardrails across Ads ML.

This role sits at the intersection of ML modeling, systems optimization, and organizational leverage. You will partner closely with ranking teams, ML Platform teams and serving owners to identify the highest-value bottlenecks, land measurable efficiency wins, and build the tooling and operating mechanisms that make those wins repeatable.

What you’ll do:

**Lead & Grow:** Hire, mentor, and retain a high-performing team of ML engineers / systems-oriented engineers working on model optimization and ML efficiency.
**Set Technical Direction:** Define the roadmap for training optimization, inference optimization, launch-readiness tooling, and reusable efficiency primitives across Ads ML.
**Deliver Measurable Wins:** Drive reductions in model training time, online latency, serving cost, and infra-driven launch risk.
**Build Systems and Tooling:** Guide the development of profiling, benchmarking, load testing, observability, cost analysis, debugging, and efficiency certification systems.
**Operate in the Critical Path:** Partner with model owners and platform teams to accelerate high-priority launches and remove bottlenecks from the path to production.
**Shape the Team’s Evolution:** Balance near-term white-glove optimization work with medium-term platformization and automation.
**Build XFN Alignment:** Work closely with MLP, AMP, Ranking, and serving teams to clarify boundaries, upstream generic wins, and keep Ads needs on track.
**Raise the Bar:** Establish engineering rigor around measurement, performance debugging, launch safety, and technical decision-making for efficiency work.

What we’re looking for:

**Deep ML Engineering Experience:** The candidate should have been close to the models themselves and understand training, serving, debugging, and optimization in depth.
**Hands-on Optimization Background:** Direct experience improving training loops, serving systems, profiling workflows, model/inference efficiency, or GPU utilization.
**Strong Managerial Ability:** Experience building and leading teams, coaching engineers, managing delivery, and making prioritization tradeoffs under ambiguity.
**Distributed Systems Fluency:** Proven ability to reason about production-scale ML systems and the tradeoffs that govern reliability, speed, cost, and scale.
**Customer and Platform Instincts:** Able to work as a service provider to modeling teams while still building reusable systems rather than only heroic one-offs.
**Strong Communication:** Can explain technical tradeoffs clearly to engineers, PMs, and senior stakeholders.
**Ads experience:** Experience in ads ranking, recommender systems, marketplace ML, or adjacent production ML domains is strongly preferred.

Nice-to-have:

Experience with GPU training and serving migrations.
Experience with PyTorch, distributed training frameworks, or kernel/performance optimization.
Experience building efficiency benchmarking or launch certification frameworks.
Experience working in organizations where ML platform and applied modeling responsibilities are split across multiple teams.

Benefits:

Comprehensive Healthcare Benefits and Income Replacement Programs
401k with Employer Match
Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
Family Planning Support
Gender-Affirming Care
Mental Health & Coaching Benefits
Flexible Vacation & Paid Volunteer Time Off
Generous Paid Parental Leave

Pay Transparency:

This job posting may span more than one career level.

In addition to base salary, this job is eligible to receive equity in the form of restricted stock units, and depending on the position offered, it may also be eligible to receive a commission. Additionally, Reddit offers a wide range of benefits to U.S.-based employees, including medical, dental, and vision insurance, 401(k) program with employer match, generous time off for vacation, and parental leave. To learn more, please visit https://www.redditinc.com/careers/.

To provide greater transparency to candidates, we share base salary ranges for all US-based job postings regardless of state. We set standard base pay ranges for all roles based on function, level, and country location, benchmarked against similar-stage growth companies. Final offer amounts are determined by multiple factors, including skills, depth of work experience, and relevant licenses/credentials, and may vary from the amounts listed below.

The base salary range for this position is:

$230,000—$322,000 USD

In select roles and locations, the interviews will be recorded, transcribed and summarized by artificial intelligence (AI). You will have the opportunity to opt out of recording, transcription and summarization prior to any scheduled interviews.

During the interview, we will collect the following categories of personal information: Identifiers, Professional and Employment-Related Information, Sensory Information (audio/video recording), and any other categories of personal information you choose to share with us. We will use this information to evaluate your application for employment or an independent contractor role, as applicable. We will not sell your personal information or disclose it to any third party for their marketing purposes. We will delete any recording of your interview promptly after making a hiring decision. For more information about how we will handle your personal information, including our retention of it, please refer to our Candidate Privacy Policy for Potential Employees and Contractors.

Reddit is proud to be an equal opportunity employer, and is committed to building a workforce representative of the diverse communities we serve. Reddit is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If, due to a disability, you need an accommodation during the interview process, please let your recruiter know.


Engineering Manager, Ads ML Efficiency
