Applied Scientist, Fintelligence

Amazon · Big Tech · Bellevue, WA · Research Science

This role focuses on building and scaling generative AI applications for finance teams, including autonomous agents, efficient inference systems, and robust evaluation frameworks. The position emphasizes shipping production-ready AI systems that handle sensitive financial data with high precision and reliability.

What you'd actually do

  1. Building AI systems that finance teams trust enough to rely on without manual review, where precision isn't a nice-to-have, it's a compliance requirement
  2. Designing agents that learn from user corrections and get measurably better with every interaction, not just at the next model release
  3. Solving inference at massive scale using tiered model architectures, intelligent routing, and small language models that deliver production-grade accuracy at a fraction of frontier model cost
  4. Developing evaluation frameworks that catch quality regressions before customers do and gate every model change before it ships

Skills

Required

  • PhD, or Master's degree and 4+ years of experience in CS, CE, ML, or a related field
  • Experience building machine learning models for business applications
  • Experience programming in Java, C++, Python, or a related language
  • 3+ years of experience building models for business applications

Nice to have

  • PhD in computer science, machine learning, engineering, or related fields
  • Experience in building speech recognition, machine translation and natural language processing systems (e.g., commercial speech products or government speech projects)
  • Patents or publications at top-tier peer-reviewed conferences or journals

What the JD emphasized

  • compliance requirement
  • evaluation frameworks
  • finance professionals
  • financial data is messy, regulated, high-stakes

Other signals

  • building AI systems that finance teams trust enough to rely on without manual review
  • designing agents that learn from user corrections and get measurably better with every interaction
  • solving inference at massive scale using tiered model architectures, intelligent routing, and small language models
  • developing evaluation frameworks that catch quality regressions before customers do and gate every model change before it ships