Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.

About the Role

We are seeking a Principal / Distinguished AI/ML Researcher and/or Engineer with deep experience in reasoning, planning, and decision-making systems. This role is ideal for individuals who have architected post-training intelligence frameworks, integrated Large Reasoning Models (LRMs) with Knowledge Graphs, and applied Reinforcement Learning (RL) as a first-class component of adaptive planning and control. You will be responsible for inventing, scaling, and operationalizing intelligent decisioning substrates that blend symbolic and sub-symbolic methods, enabling next-generation AI systems that go beyond pattern recognition into the realm of deliberation, foresight, and agency.

Our mission is to build cognitive AI systems that combine post-trained foundational models, explicit memory and knowledge, and recursive planning strategies to power sophisticated real-world decisioning in personalized environments. You will collaborate across disciplines and influence company-wide AI architecture.

A core dimension of this role is the design and deployment of multi-agent systems, where reasoning, planning, and decisioning are distributed across networks of intelligent agents. You will formulate coherent, synergistic strategies that enable agents to cooperate, negotiate, and align objectives, ensuring that distributed intelligence converges to purposeful, high-quality outcomes across contexts.

Relevance and Impact of This Role

This role has the potential to fundamentally transform Airbnb from a platform that primarily predicts and ranks into a platform capable of reasoning, planning, and making adaptive decisions across complex real-world environments. In the short term, the impact comes from improving decision quality, contextual intelligence, adaptive personalization, and operational coordination across guest and host workflows. AI systems would move beyond static prediction and retrieval toward goal-directed reasoning systems capable of handling ambiguity, constraints, trade-offs, and multi-step planning. Guests would experience more intelligent planning, coordination, and conversational assistance, while hosts and internal teams would benefit from systems capable of adaptive optimization, dynamic recommendations, policy-aware decisioning, and intelligent workflow orchestration. Internally, these systems would significantly improve search, ranking, personalization, support, experimentation, and operational automation by introducing reasoning-aware and planning-aware intelligence into the AI stack.

In the medium term, Airbnb could evolve into a deeply adaptive cognitive marketplace platform where AI systems continuously reason about goals, constraints, user intent, marketplace dynamics, and long-term outcomes. Instead of isolated models making narrow predictions, the platform would increasingly operate through reasoning and planning substrates that coordinate retrieval, memory, reinforcement learning, knowledge graphs, and multi-agent orchestration into unified decision-making systems. Airbnb would gain the ability to support complex multi-stage planning and adaptive coordination across travel, hosting, operations, support, trust, and marketplace optimization. This would create stronger ecosystem intelligence, operational leverage, adaptive personalization, and marketplace resilience while enabling the platform to handle increasingly sophisticated and dynamic real-world interactions.

In the long term, this role helps establish Airbnb’s strategic leadership in cognitive AI systems and distributed intelligence architectures. The technologies developed under this role become the decisioning and reasoning substrate underlying the entire ecosystem — enabling AI systems that can deliberate, coordinate, adapt, simulate outcomes, reason under uncertainty, and make coherent long-horizon decisions across multiple agents and environments. Airbnb would evolve beyond a recommendation and transaction platform into an intelligent coordination and planning ecosystem capable of operating as a large-scale real-world cognitive system. Over time, this could position Airbnb as one of the most advanced applied reasoning and multi-agent intelligence platforms in the consumer internet — where AI systems do not merely predict behavior, but actively reason, plan, and coordinate actions across the marketplace in ways that continuously improve user outcomes, ecosystem health, and long-term strategic adaptability.

What You Will Do

Research & Innovation

Drive foundational and applied research in reasoning engines, planning architectures, and decision-making frameworks at scale, bringing generative AI into the ranking, recommendation, and personalization stack at both the single-model and multi-agent (system) level. The objective is to grow the business (new-user growth, abandoned users, long-tail users) in existing and new business areas while supporting multi-modal NL → conversational interfaces.
Advance techniques in LLM/LRM post-training, reinforcement learning–based decisioning, and knowledge-integrated agents.
Design methods for plan induction, value estimation, and contingency modeling within intelligent agents.
Explore and validate protocols for distributed reasoning and joint planning among cooperative agents in multi-agent systems.

System Design & Architecture

Architect reasoning, planning, and decision-making (RPD) systems that integrate post-trained LLMs/LRMs, graph-structured memory (e.g., KGs), and RL-driven controllers.
Design recursive task planners, search-based or policy-based reasoners, and belief-state trackers that can interoperate with large model substrates.
Ensure modularity and extensibility through multi-agent frameworks, agentic substrates, and declarative planning pipelines.
Define communication protocols, coordination strategies, and cross-agent knowledge alignment mechanisms to foster emergent cooperative intelligence.

Model Development

Build and evolve stateful, dynamic models that combine supervised learning with online/offline reinforcement, simulation-based rollouts, and symbol grounding.
Implement hybrid pipelines that couple learned embeddings, prompted generative models, and graph-theoretic inference.
Optimize systems for adaptive exploration, planning horizon control, and policy robustness.
Develop frameworks for distributed value propagation, multi-agent credit assignment, and global planning from local agents.

Technical Leadership

Set direction for planning/reasoning infrastructure within the AI/ML platform strategy.
Serve as the technical conscience and architectural leader across high-stakes AI initiatives involving autonomous agents or high-fidelity decision pipelines.
Mentor teams in systems thinking, causal modeling, symbolic-connectionist integrations, and long-term planning under uncertainty.
Lead development of multi-agent reasoning systems, defining principles for inter-agent knowledge exchange, goal delegation, and cooperative decision resolution.

Collaboration

Work across disciplines—product, infra, and design—to translate ambiguous product intent into multi-stage reasoning pipelines.
Partner with researchers, ontologists, and ML engineers to encode world knowledge, goals, and values into usable inference artifacts.
Contribute to a company-wide understanding of what it means to make intelligent choices, not just predictions.
Collaborate with internal teams on distributed agent coordination, shared memory protocols, and policy harmonization across decision surfaces.

Operational Excellence

Productionize real-time reasoning loops with low-latency inference, caching, retrieval-augmented generation, and streaming updates to symbolic memory.
Deploy post-training hooks for inserting logic, constraints, and domain priors into existing large models.
Create advanced monitoring, attribution, and evaluation pipelines for agent behavior and decision quality.
Operationalize multi-agent orchestration, ensuring reliable and fault-tolerant communication and decision propagation.

Minimum Qualifications

Master's degree or equivalent in Computer Science, AI, Cognitive Science, or related fields.
Recent published work or patents in AI, Cognitive Science, or related fields.
15+ years in AI/ML, including post-training architectures and production-scale reasoning systems.
Advanced coding proficiency in Java, Python, C++, or similar, with experience in ML/RL frameworks (e.g., PyTorch, Ray, JAX, RLlib) at scale.
Proven experience integrating LLMs/LRMs with Knowledge Graphs or structured world models.
Deep understanding of Reinforcement Learning and its application to decisioning and planning.
Fluency in hybrid model architectures: connectionist-symbolic fusion, retrieval-based agents, or goal-directed transformers.
Experience working on multi-agent coordination, distributed RL, or cooperative inference systems.

Preferred Qualifications

Ph.D. in AI, Machine Learning, Robotics, Cognitive Systems, or related areas.
Published work or patents in multi-agent reasoning, plan synthesis, knowledge-augmented learning, or generative control.
Experience in cognitive architectures, neuro-symbolic systems, or agent-based simulation environments.
Demonstrated ability to lead cross-functional research-to-production transitions.
Experience with memory architectures, task graphs, or semantic program induction.
Prior work on distributed intelligence platforms with explicit agent interaction models and collective decision-making logic.

Your Location:

This position is US - Remote Eligible. The role may include occasional work at an Airbnb office or attendance at offsites, as agreed to with your manager. While the position is Remote Eligible, you must live in a state where Airbnb, Inc. has a registered entity. Airbnb maintains an up-to-date list of excluded states. This list is continuously evolving, so please check back with us if the state you live in is on the exclusion list. If your position is employed by another Airbnb entity, your recruiter will inform you which states you are eligible to work from.

Our Commitment To Inclusion & Belonging:

Airbnb is committed to working with the broadest talent pool possible. We believe diverse ideas foster innovation and engagement, and allow us to attract creatively-led people, and to develop the best products, services and solutions. All qualified individuals are encouraged to apply.

We also strive to provide a disability-inclusive application and interview process. If you are a candidate with a disability and require reasonable accommodation in order to submit an application, please contact us at: reasonableaccommodations@airbnb.com. Please include your full name, the role you’re applying for, and the accommodation necessary to assist you with the recruiting process.

We ask that you only reach out to us if you are a candidate whose disability prevents you from being able to complete our online application.

How We'll Take Care of You:

Our job titles may span more than one career level. The actual base pay is dependent upon many factors, such as: training, transferable skills, work experience, business needs and market demands. The base pay range is subject to change and may be modified in the future. This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.

Pay Range

$296,000—$370,000 USD

Principal AI/ML Researcher / Engineer: Reasoning, Planning, and Decision-Making Systems
