Staff Research Engineer, Data Agents

Databricks · Data AI · San Francisco, CA · Engineering - Pipeline

Research Engineer role focused on developing frontier enterprise data agents capable of autonomous planning, code generation, and multi-step workflow execution. The role involves post-training enhancements, harness design, agentic reinforcement learning, and shipping improvements to the Genie agent product.

What you'd actually do

  1. Develop the best post-training recipes for training enterprise data agents capable of autonomous planning, code generation, and multi-step workflow execution in intricate enterprise settings.
  2. Partner closely with product teams to turn prototypes and research ideas into the best agentic experience for Databricks users.
  3. Build systems that help the agent discover and use relevant lakehouse context, including tables, notebooks, code, and cell outputs, to produce more accurate and useful results.
  4. Raise the technical bar for the team through strong design, execution, debugging, and mentorship, helping shape the long-term direction of agentic experiences at Databricks.

Skills

Required

  • LLMs
  • agents
  • reinforcement learning
  • post-training workflows
  • experience in an applied research environment
  • shipping research prototypes to production

Nice to have

  • BS, MS, or PhD in Computer Science or a related field
  • Clear communication and strong cross-functional collaboration with researchers, engineers, and product stakeholders

What the JD emphasized

  • track record of shipping research prototypes to production

Other signals

  • Developing frontier enterprise data agents
  • Autonomous planning, code generation, multi-step workflow execution
  • Shipping direct improvements to Genie, Databricks' agent product