Senior Staff Machine Learning Engineer

Robinhood · Fintech · Bellevue, WA +1 · ENG Data and AI Platform Division

This Senior Staff Machine Learning Engineer role at Robinhood focuses on defining and upholding the quality bar for ML systems across the organization. It involves designing evaluation frameworks, guiding model selection, and partnering with product, data science, and engineering teams to ensure systems meet clear standards for correctness, safety, latency, and user satisfaction. The work will shape how ML models are built, evaluated, and improved across Robinhood.

What you'd actually do

  1. Build and evaluate in-house, frontier, and fine-tuned models across quality, latency, cost, and edge cases to determine appropriate use cases
  2. Partner with product managers, data scientists, and engineers to translate evaluation results into clear launch criteria for AI systems
  3. Analyze production issues, identify root causes, and prioritize improvements to increase system reliability and performance
  4. Build visibility into model performance through metrics, monitoring, and reporting that inform roadmap decisions
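To make the responsibilities above concrete, here is a minimal sketch of the evaluation-scorecard and launch-criteria idea: score a model on a small labeled dataset across quality and latency, then turn the results into a go/no-go decision. All names, thresholds, and the toy "model" are hypothetical illustrations, not Robinhood's actual stack.

```python
# Hypothetical sketch: evaluate a model across quality and latency, then
# check the resulting scorecard against explicit launch criteria.
import time
from dataclasses import dataclass

@dataclass
class Scorecard:
    accuracy: float
    p95_latency_ms: float

def evaluate(model, dataset) -> Scorecard:
    """Run the model over (input, expected) pairs, recording quality and latency."""
    correct, latencies = 0, []
    for x, expected in dataset:
        start = time.perf_counter()
        prediction = model(x)
        latencies.append((time.perf_counter() - start) * 1000.0)
        correct += prediction == expected
    latencies.sort()
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    return Scorecard(accuracy=correct / len(dataset), p95_latency_ms=p95)

def meets_launch_criteria(card: Scorecard,
                          min_accuracy: float = 0.9,
                          max_p95_ms: float = 50.0) -> bool:
    """Translate evaluation results into a clear launch decision."""
    return card.accuracy >= min_accuracy and card.p95_latency_ms <= max_p95_ms

# Toy usage: a trivial "model" that uppercases its input.
dataset = [("buy", "BUY"), ("sell", "SELL"), ("hold", "HOLD")]
card = evaluate(str.upper, dataset)
print(meets_launch_criteria(card))
```

In practice the scorecard would carry many more dimensions (cost, safety, edge-case slices), but the shape is the same: measured metrics on one side, agreed thresholds on the other.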

Skills

Required

  • Experience building complex and impactful production ML models and systems
  • Understanding of the tradeoffs among performance, cost, and latency
  • Deep experience defining and measuring quality for machine learning systems using evaluation frameworks, datasets, and scorecards
  • Demonstrated ability to analyze production issues and lead initiatives that improve system quality across multiple teams
  • Comfortable working with engineers, data scientists, and product partners to deliver measurable improvements in system performance

Nice to have

  • Experience with traditional ML models
  • Experience with deep learning models, LLMs, and Transformer models
  • Experience building or operating systems in regulated environments
  • Experience working with AI evaluation and observability tools

What the JD emphasized

  • Defining and measuring quality for machine learning systems using evaluation frameworks, datasets, and scorecards
  • Analyzing production issues and leading initiatives that improve system quality across multiple teams
  • Experience building or operating systems in regulated environments

Other signals

  • Defining and upholding the quality bar for ML systems
  • Designing evaluation frameworks
  • Guiding model selection
  • Ensuring systems meet clear standards for correctness, safety, latency, and user satisfaction
  • Building visibility into model performance through metrics, monitoring, and reporting