🚀 About WRITER

WRITER is where the world's leading enterprises orchestrate AI-powered work. Our vision is to expand human capacity through superintelligence, and we're proving it's possible through powerful, trustworthy AI that unites IT and business teams to unlock enterprise-wide transformation. With WRITER's end-to-end platform, hundreds of companies like Mars, Marriott, Uber, and Vanguard are building and deploying AI agents that are grounded in their company's data and fueled by WRITER's enterprise-grade LLMs. Valued at $1.9B and backed by industry-leading investors including Premji Invest, Radical Ventures, and ICONIQ Growth, WRITER is rapidly cementing its position as the leader in enterprise generative AI.

Founded in 2020 with office hubs in San Francisco, New York City, Austin, Chicago, and London, our team thinks big and moves fast, and we're looking for smart, hardworking builders and scalers to join us on our journey to create a better future of work with AI.

📐 About the role

AI research at WRITER isn't just about publishing papers — it's about building the scientific foundation that powers some of the most ambitious enterprise AI deployments in the world. As a staff AI research scientist, you'll be at the center of that work. You'll drive a high-impact research agenda focused on large language models, agentic reasoning, and the system-level capabilities that make AI genuinely useful at enterprise scale. This is a rare opportunity to do research that matters twice over — advancing the field and shipping directly into products used by hundreds of thousands of people every day.

We're at an inflection point. Enterprises are moving from experimenting with AI to deeply embedding it across their operations, and WRITER's models are the engine making that possible. The work you do here — on post-training, planning, multi-step reasoning, and agentic workflows — will directly shape how the next generation of enterprise AI behaves, performs, and scales. You'll have the resources, infrastructure, and cross-functional support to pursue ambitious ideas and bring them to life quickly.

This role is hybrid, based out of our San Francisco or New York City hub. You'll report to our VP of AI research.

🦸🏻‍♀️ What you'll do

Lead an independent, high-impact research agenda on large language models and agentic systems, owning projects from early hypothesis through model training, evaluation, and production deployment
Design and execute large-scale post-training experiments using supervised fine-tuning, reinforcement learning from human feedback (RLHF), RLAIF, DPO, and emerging alignment techniques — with a focus on improving multi-step reasoning, planning, and tool use in enterprise agentic workflows
Build novel evaluation benchmarks and methodologies that push beyond existing limitations, establishing rigorous measures for how well models perform on complex, real-world enterprise tasks
Develop scalable data synthesis and curation pipelines that generate the high-quality training signal driving model improvement — including LLM-as-judge frameworks, synthetic data generation, and adversarial dataset construction
Shape WRITER's model architecture and training roadmap by translating your research insights into concrete improvements to our enterprise-grade LLMs, working hand-in-hand with research engineering and product teams
Publish and present original research at top-tier venues — NeurIPS, ICLR, ICML, ACL, and others — representing WRITER at the frontier of the field and contributing to the broader scientific community
Mentor and uplevel fellow researchers and engineers on the team, helping set a high bar for scientific rigor, experimental design, and research culture

⭐️ What you need

7+ years of hands-on ML research experience, with deep expertise in large language model pre-training and post-training — you've trained models at scale, debugged distributed jobs, and shipped improvements that made a measurable difference
Expert-level knowledge of post-training methods including SFT, RLHF, RLAIF, DPO, GRPO, and related alignment and reasoning techniques, with a track record of applying them to real, production-grade systems
Strong command of Python and PyTorch (or JAX), with the engineering depth to build and scale training pipelines, evaluation infrastructure, and data synthesis workflows yourself — not just direct others to do it
A meaningful publication record at competitive ML/AI venues (NeurIPS, ICLR, ICML, ACL, EMNLP, or equivalent), evidencing your ability to originate ideas and execute on a multi-month research agenda independently
Hands-on experience designing or evaluating agentic systems — models that plan, reason through multi-step tasks, use tools, and recover gracefully from errors — with a nuanced understanding of where they break and how to fix them
A Ph.D. in Computer Science, Machine Learning, NLP, or a related field — or equivalent demonstrated research experience with a strong portfolio of independent, published work
The instincts and orientation that match WRITER's values: you Connect — you collaborate openly across research, engineering, and product and communicate complex ideas with clarity to both technical and non-technical audiences; you Challenge — you ask the hard questions, push back on conventional wisdom, and pursue the research directions others haven't tried yet; you Own — you drive your projects end-to-end with urgency, take accountability for results, and care deeply about the impact your work has on real customers

🍩 Benefits & perks (US Full-time employees)

Generous PTO, plus company holidays
Medical, dental, and vision coverage for you and your family
Paid parental leave for all parents (16 weeks)
Fertility and family planning support
Early-detection cancer testing through Galleri
Flexible spending account and dependent FSA options
Health savings account for eligible plans with company contribution
Annual work-life stipends:
- Wellness stipend for gym, massage/chiropractor, personal training, etc.
- Learning and development stipend
Company-wide off-sites and team off-sites
Competitive compensation, company stock options, and a 401(k)

WRITER is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

By submitting your application on the application page, you acknowledge and agree to WRITER's Global Candidate Privacy Notice.

Staff AI Research Scientist
