At Netflix, our mission is to entertain the world. Together, we are writing the next episode: pushing the boundaries of storytelling and global fandom, and making the unimaginable a reality. We are a dream team obsessed with the uncomfortable excitement of discovering what happens when you merge creativity, intuition and cutting-edge technology. Come be a part of what’s next.

At Netflix, we want to entertain the world, and we're constantly innovating on how entertainment is imagined, created, and delivered to a global audience. Increasingly, that innovation is powered by AI — and Netflix is making a deliberate bet to become AI Native, with AI woven into how we build products, create content, and run the company.

The Opportunity

The Agent Platform team gives Netflix engineers the infrastructure to go from zero to a production-grade AI agent without reinventing the wheel. We own the foundational building blocks the whole company builds on: the Model Gateway (unified access to external LLMs like Claude, GPT, and Gemini), the Assistance API for conversational use cases, the MCP Gateway that connects agents to Netflix's internal systems and knowledge, our Agent SDK, and an end-to-end evaluation stack.

The work has moved well beyond chat completions. Teams across Netflix are now building agents — systems that plan, call tools, observe results, and iterate — and they depend on us for the infrastructure to do that reliably and to know whether their agents are actually any good. We're a small team with outsized leverage: what we ship becomes the foundation for AI across all of Netflix.

What you will do:

Design, build, and operate the Agent SDK and MCP Gateway that Netflix engineers use to build, deploy, and run AI agents in production.
Build agents and agent infrastructure across the full lifecycle — plan/act/observe loops, tool and MCP integrations, deployment, and day-2 operations.
Make evaluation a first-class part of the platform: build the tracing, eval suites, and quality signals that let teams measure agents, catch regressions, and iterate to make them better.
Own reliability, observability, and guardrails for non-deterministic systems running at very high scale.
Lead cross-functional initiatives with ML scientists, data scientists, product managers, and other AI Platform teams.
Rapidly iterate with users to improve the developer experience while establishing durable foundational capabilities.

Desired Background:

8+ years of software engineering experience with a track record of delivering quality results.
Hands-on experience building, deploying, operating, AND evaluating LLM agents in production — not just chat-completion apps or prototypes.
Experience with one or more agent frameworks/SDKs (Strands, OpenAI Agents SDK, Anthropic Claude Agent SDK, LangGraph, pydantic-ai, CrewAI, Google ADK) and with tool/function calling and MCP.
Experience with LLM/agent evaluation and observability — building eval suites, tracing, and quality measurement, then iterating on results (Braintrust, LangSmith, W&B, or equivalent).
Strong experience building SDKs and APIs for internal or external developers.
Strong fundamentals in building and operating scalable, observable, fault-tolerant distributed systems.
Proficiency in Python (and Python packaging tooling) plus one of Java, Go, C/C++, Rust, or Zig. Familiarity with our stack — Temporal, FastAPI, PostgreSQL, Kubernetes — is a plus.
Experience with large-scale build, release, CI/CD, and observability methods.

Generally, our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top-of-market compensation within the market range, we rely on market indicators and consider your specific job family, background, skills, and experience. The range for this role is $466,000.00 - $750,000.00. This compensation range will vary based on location.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick time. Full-time salaried employees are immediately entitled to flexible time off. See more details about our Benefits here.

Netflix is a unique culture and environment. Learn more here.

Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.

This job is open for no less than 7 days and will be removed when the position is filled.