What you'd actually do

Train and deploy language models adapted to specific industry needs.

Create and adapt novel training and fine-tuning algorithms for language models, with a special focus on reinforcement learning for long-horizon dynamic workflows.

Research innovation and scholarly dissemination: conceive and execute research projects that advance training methodologies; write and submit peer‑reviewed papers or preprints; and present work at conferences.

Drive end‑to‑end translation of research into product capabilities, leading projects from ideation and prototyping through production integration and measurable customer impact.

Skills

Required

Python
PyTorch
training/fine-tuning AI/ML models
Generative AI pipelines
RAG (retrieval-augmented generation)

Nice to have

multi-agent training in a dynamic harness
training or contributing to the development of very large-scale language models (e.g., 100B+ to trillion-parameter models)
distributed training
async RL
long sequence handling

Overview

Frontier Tuning aims to fine-tune frontier large language models (LLMs) on enterprise data, enabling task-specific agents and solutions. We are a small, nimble team advancing the state of the art of the models in M365 Copilot. Come join our team and help transform the LLM experience in the enterprise.

Role Summary

We are seeking an **Applied Scientist II** with stellar research and system-building skills and the desire to pursue cutting-edge model development that pushes technological boundaries. We are looking for candidates with interest and experience in LLM post-training, especially reinforcement learning for long-horizon dynamic workflows.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

Train and deploy language models adapted to specific industry needs.
Create and adapt novel training and fine-tuning algorithms for language models, with a special focus on reinforcement learning for long-horizon dynamic workflows.
Research innovation and scholarly dissemination: conceive and execute research projects that advance training methodologies; write and submit peer‑reviewed papers or preprints; and present work at conferences.
Drive end‑to‑end translation of research into product capabilities, leading projects from ideation and prototyping through production integration and measurable customer impact.

Qualifications

Required/Minimum Qualifications:

Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 2+ years related experience (e.g., statistics, predictive analytics, research).
- OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 1+ year(s) related experience (e.g., statistics, predictive analytics, research).
- OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field.
- OR equivalent experience.
1+ years of experience training/fine-tuning AI/ML models, preferably large language models.
1+ years of experience building Generative AI pipelines, e.g., with RAG (retrieval-augmented generation).
1+ years of experience with Python and/or PyTorch.

**Other Requirements:** The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

Experience with multi-agent training in a dynamic harness.
Experience training or contributing to the development of very large-scale language models (e.g., 100B+ to trillion-parameter models), including distributed training, async RL, and long sequence handling.

#post-training

#RL

#agent

#harness

#M365CORE

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about **requesting accommodations.**