Senior Agentic AI Research Scientist

Axon · Enterprise · Office, WA · 2014 Artificial Intelligence

Senior AI Research Scientist focused on agentic video and multimodal reasoning systems, developing AI solutions for the enterprise and public safety domains. Responsibilities include advancing the state of the art in machine learning and multimodal technology, creating algorithms for perception and understanding of visual and multimodal data, and building responsible AI capabilities. The role involves leading research in AI-assisted universal search over multimodal data, advancing computer vision and gen-AI techniques, designing scalable models for inference, developing metrics for agentic systems, and translating research into products. Requires a PhD and 3+ years of experience in areas such as multimodal ML, LLMs, video understanding, agentic reasoning, or RAG, along with a track record of applied research or production impact.

What you'd actually do

  1. Collaborate with other scientists, engineers and product managers to build proofs of concept that shape the Axon of tomorrow.
  2. Lead end-to-end research efforts in AI-assisted universal search of and reasoning over large multimodal data with an emphasis on video understanding.
  3. Advance related computer vision, machine learning and gen-AI techniques for cloud and devices from multimodal data sources, including scene understanding, action recognition and anomaly detection.
  4. Design and implement responsible, privacy-preserving, efficient and scalable models for inference and analysis of visual data.
  5. Develop performance and quality metrics for agentic multimodal reasoning models and systems, and validate their effectiveness in real-world settings.

Skills

Required

  • Python
  • PyTorch
  • TensorFlow
  • Keras
  • multimodal machine learning
  • large language models
  • video understanding
  • information retrieval
  • agentic reasoning
  • computer vision
  • temporal modeling
  • applied research
  • production impact
  • ML development lifecycle
  • RAG architectures
  • embeddings
  • indexing
  • vector search
  • distributed training
  • distributed inference
  • evaluation
  • ranking optimization
  • tool use
  • grounding LLMs

Nice to have

  • academic publications
  • technical documentation
  • patent disclosures
  • coaching junior scientists

What the JD emphasized

  • PhD and 3+ years of experience in Computer Science or a related field with a focus on multimodal machine learning, large language models, video understanding, information retrieval, agentic reasoning or related technical fields.
  • Proven track record of applied research and/or production impact in multimodal learning, video-language modeling, retrieval systems, foundation models, agentic systems, or RAG architectures.
  • Experience owning and driving the ML development lifecycle from problem definition and data strategy through model development, evaluation, deployment, and iteration in production environments.
  • Experience contributing to large-scale ML and multimodal RAG systems including embeddings, indexing and vector search at scale, distributed training or inference workflows, evaluation and ranking optimization, agentic tool use, and grounding LLMs with external knowledge.
  • Strong understanding of video analysis and temporal modeling as well as computer vision fundamentals and trade-offs.

Other signals

  • agentic video and multimodal reasoning systems
  • AI solutions that transform both the enterprise and public safety domains
  • advance the state-of-the-art in machine learning and multimodal technology
  • develop cutting-edge algorithms and solutions that enable intelligent perception and understanding of visual and multimodal data
  • AI-assisted universal search of and reasoning over large multimodal data with an emphasis on video understanding
  • advance related computer vision, machine learning and gen-AI techniques for cloud and devices from multimodal data sources, including scene understanding, action recognition and anomaly detection
  • Develop performance and quality metrics for agentic multimodal reasoning models and systems
  • Stay up-to-date with the latest research and advances in agentic AI, machine learning and computer vision and translate relevant findings into shipping Axon products