Senior Machine Learning Engineer, Agentic

Robinhood · Fintech · Bellevue, WA +1 · ENG Data and AI Platform Division

Robinhood is seeking a Senior Machine Learning Engineer for its Agentic team to build and ship production AI agents for financial products. The role involves developing evaluation harnesses and feedback pipelines, implementing optimization techniques (DPO, PPO, reward modeling), launching fine-tuned models in production, and collaborating with research teams on agentic reasoning, planning, and tool use.

What you'd actually do

  1. Translate product goals into measurable metrics and SLOs, and build a rigorous evaluation harness to continuously score agent performance
  2. Develop feedback and optimization pipelines that use both automated metrics and human-in-the-loop evaluation signals to improve agent behavior over time
  3. Implement and scale optimization techniques such as Direct Preference Optimization (DPO), Proximal Policy Optimization (PPO), and reward modeling to improve agent performance
  4. Launch and support fine-tuned models in production environments with robust evaluation, rollback strategies, and performance monitoring
  5. Collaborate closely with applied AI/ML teams to translate state-of-the-art research in agentic reasoning, planning, and tool use into reliable, production-ready systems

Skills

Required

  • Strong technical expertise in software development
  • Understanding of agentic workflows, including reasoning loops, tool invocation, memory, and orchestration of autonomous AI agents
  • Hands-on experience using Large Language Models, including prompt engineering, fine-tuning, model distillation, and deploying optimized models (e.g. via DPO, PPO) into production environments
  • Leadership and mentorship capabilities
  • Excellent communication and collaboration skills

Nice to have

  • Innovation mindset, bias toward action, and commitment to continuous learning: staying at the forefront of ML/AI trends, agentic systems research, and best practices in tooling, safety, and evaluation

What the JD emphasized

  • production AI agents
  • high-performance AI agents
  • production-grade infrastructure
  • strong evaluation and observability
  • continuous optimization
  • agentic reasoning, planning, and tool use
