Member of Technical Staff - Agent Dx Research

Modal · Data AI · New York, NY · Engineering

Research role focused on building an evaluation framework for AI coding agents to improve developer experience on the Modal platform. This involves defining quantitative objectives, measuring performance, and translating insights into product improvements, while staying updated on agent advancements and customer use cases.

What you'd actually do

  1. Build out a framework and process for agent productivity evaluation
  2. Define quantitative objectives
  3. Design systems to measure performance
  4. Translate results into product improvements
  5. Stay on top of new developments in agent tools and workflows, and work with our customers to understand how they're using coding agents with Modal and where we can provide more value

Skills

Required

  • design and implement scalable agent benchmarking workflows
  • experience with experimental design, measurement, and statistical evaluation
  • up-to-date knowledge of the latest advances in coding agents
  • interest in developer tooling and opinions about developer ergonomics
  • familiarity with the use cases that Modal serves (generative AI inference, large-scale batch jobs, multi-node training, etc.)
  • strong communication skills and the ability to convey research insights to decision makers

Nice to have

  • PhD in Computer Science, Human-Computer Interaction, Cognitive Science, Operations Research, or another related field
  • prior experience as a Machine Learning Scientist, Quantitative UX Researcher, or in a similar role on a product team

What the JD emphasized

  • rigorous evaluation
  • quantitative objectives
  • measure performance
  • agent productivity evaluation
  • coding agents

Other signals

  • AI agents
  • developer experience
  • evaluation framework
  • product improvements