Research Scientist

at Cursor · Coding AI · San Francisco, CA · Engineering

Research Scientist at Cursor focused on automating coding by training frontier coding agents. The role involves driving RL or mid-training research, owning ambiguous research problems end-to-end, and pushing results into the next model. Key responsibilities include improving RL understanding for longer tasks with less compute, training graders for coding tasks with non-verifiable rewards, improving training data quality, and researching real-time RL for coding agents. Requires a deep background in RL, strong ML fundamentals, excellent programming skills, and the ability to handle ambiguous research tasks with autonomy.

What you'd actually do

  1. Improve our understanding of RL, of what it takes to handle longer-horizon tasks, and of how to train with less compute
  2. Train graders to improve performance on coding tasks with non-verifiable reward
  3. Improve the quality and difficulty of datapoints we use for training our models
  4. Research real-time RL for coding agents

Skills

Required

  • deep background in RL
  • strong machine learning fundamentals
  • excellent programmer and software engineer
  • ability to handle ambiguous research tasks with little guidance
  • care about data quality
  • dive into the data when appropriate
  • truth seeking

Nice to have

  • creative
  • passionate

What the JD emphasized

  • drive effective RL or mid-training research
  • own ambiguous, hard research problems end-to-end
  • forming hypotheses, designing experiments, building the training/eval/data needed to test them, and pushing results into the next model
  • significantly more scope and autonomy
  • handle ambiguous research tasks with little guidance

Other signals

  • frontier coding agents
  • RL
  • real user data
  • research scientists
  • ambiguous, hard research problems end-to-end
  • forming hypotheses
  • designing experiments
  • building the training/eval/data needed to test them
  • pushing results into the next model
  • scope and autonomy
  • longer horizon tasks
  • less compute
  • Train graders
  • coding tasks with non-verifiable reward
  • data quality
  • Realtime RL
  • coding agents
  • deep background in RL
  • strong machine learning fundamentals
  • excellent programmer and software engineer
  • ambiguous research tasks with little guidance
  • truth seeking

Our mission is to automate coding. The first step in our journey is to build the best tool for professional programmers, using a combination of inventive research, design, and engineering. Our organization is very flat, and our team is small and talent dense. We particularly like people who are truth-seeking, passionate, and creative. We enjoy spirited debate, crazy ideas, and shipping code.

Research Scientist

Cursor is building the future of coding. We train frontier coding agents and scale RL on real user data to make them increasingly effective.

About the role

We’re looking for Research Scientists who can drive effective RL or mid-training research in a small-team setting. You’ll own ambiguous, hard research problems end-to-end: forming hypotheses, designing experiments, building the training/eval/data needed to test them, and pushing results into the next model. You should expect significantly more scope and autonomy than in other research labs.

What you’ll do

  • Improve our understanding of RL, of what it takes to handle longer-horizon tasks, and of how to train with less compute
  • Train graders to improve performance on coding tasks with non-verifiable reward
  • Improve the quality and difficulty of datapoints we use for training our models
  • Realtime RL for coding agents

You may be a fit if

  • You have a deep background in RL and strong machine learning fundamentals
  • You’re an excellent programmer and software engineer
  • You can handle ambiguous research tasks with little guidance
  • You care a lot about data quality, and can dive into the data when appropriate
  • You are truth-seeking, aiming to learn more about the science rather than to prove your ideas correct
