Model Behaviour Architect, Function Calling

Mistral AI · AI Frontier · London, United Kingdom · Research

Mistral AI is hiring a Model Behaviour Architect focused on Function Calling to define and measure how LLMs interact with tools, invoke functions, and orchestrate complex workflows. The role involves improving model behaviour so that tool use is accurate and reliable, collaborating with scientists, and developing evaluation pipelines.

What you'd actually do

  1. Interact with models to identify opportunities for improving function calling and tool use behaviour.
  2. Gather internal and external feedback to scope and prioritise areas for enhancement.
  3. Design and implement evaluations, data guidelines, and synthetic tool environments and APIs.
  4. Address edge cases such as malformed arguments, hallucinated functions, and incorrect tool selection through rigorous testing.
  5. Develop robust evaluation pipelines to assess the function-calling capabilities of model candidates.
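To make the edge cases in item 4 concrete, here is a minimal sketch of how a single model-emitted tool call might be labelled during evaluation. It is an illustration only, not Mistral AI's pipeline: the tool registry `TOOLS`, the call format (`name` plus `arguments`), and the helpers `classify_call` and `summarise` are all hypothetical, and a real pipeline would more likely validate arguments against a full JSON Schema.

```python
import json
from collections import Counter

# Hypothetical tool registry (an assumption for this sketch):
# tool name -> required argument names and their accepted Python types.
TOOLS = {
    "get_weather": {"city": str},
    "convert_currency": {"amount": (int, float), "source": str, "target": str},
}


def classify_call(raw_call: str, expected_tool: str) -> str:
    """Label one model-emitted tool call with a single outcome category."""
    try:
        call = json.loads(raw_call)
    except json.JSONDecodeError:
        return "malformed_arguments"  # output is not valid JSON
    # A call without a string "name" is treated as malformed in this sketch.
    if not isinstance(call, dict) or not isinstance(call.get("name"), str):
        return "malformed_arguments"
    name = call["name"]
    if name not in TOOLS:
        return "hallucinated_function"  # the model invented a tool
    if name != expected_tool:
        return "incorrect_tool_selection"  # real tool, wrong one for the task
    args = call.get("arguments")
    schema = TOOLS[name]
    # Require exactly the declared argument names, each with an accepted type.
    if not isinstance(args, dict) or set(args) != set(schema):
        return "malformed_arguments"
    for key, accepted in schema.items():
        if not isinstance(args[key], accepted):
            return "malformed_arguments"
    return "ok"


def summarise(cases):
    """Count outcome categories over (raw_call, expected_tool) pairs."""
    return Counter(classify_call(raw, expected) for raw, expected in cases)
```

Running `summarise` over a batch of (raw call, expected tool) pairs gives per-category counts, which is one simple way to compare model candidates on these failure modes.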

Skills

Required

  • Deep understanding of API design, structured outputs, and schema specification (e.g., JSON Schema); or expertise in engineering and code behaviour; or experience with LLM agents, including reasoning, planning, and multi-step tool use.
  • Prior knowledge of training and optimising model behaviour for real-world applications.
  • Expertise in building robust, scalable evaluation frameworks for AI systems.
  • Ability to thrive in dynamic, technically complex environments and deliver innovative solutions.
  • Track record of solving open-ended challenges with creative, out-of-the-box approaches.

What the JD emphasized

  • function calling
  • tool use
  • orchestrate complex workflows
  • accurate, reliable, and intelligent tool use
  • multi-step orchestration
  • error recovery
  • model behaviour
  • real-world problems
  • API design
  • structured outputs
  • schema specification
  • LLM agents
  • reasoning
  • planning
  • multi-step tool use
  • training and optimising model behaviour
  • real-world applications
  • robust, scalable evaluation frameworks
  • technically complex environments
  • open-ended challenges
  • creative, out-of-the-box approaches

Other signals

  • LLM Tool Use
  • Function Calling
  • Agent Orchestration
  • Evaluation Frameworks