Senior Machine Learning Engineer - Agent Tools Interop (AU Remote)

Canva · Enterprise · Sydney, Australia · Information Technology

This Senior Machine Learning Engineer role focuses on agent tool interoperability: making tool calling reliable, safe, and scalable for agents within Canva's ecosystem. The work involves designing systems for tool discovery, invocation, and execution, building evaluation pipelines, and collaborating with GenAI and platform teams to integrate agentic capabilities.

What you'd actually do

  1. Build and evolve the systems that enable agents to discover, invoke, and safely execute capabilities across Canva at scale, from initial foundations through to long-term platform maturity.
  2. Design tool schemas and definition patterns that maximize LLM tool selection accuracy and reliable invocation across diverse agent consumers and AI integrations.
  3. Build and operate evaluation pipelines that measure tool calling behavior in production, catch regressions, and drive continuous quality improvement.
  4. Collaborate with product, platform, and GenAI teams to integrate agentic capabilities into production systems and understand how tool use behaves at real-world scale.
  5. Advise contributing teams on how to define tools that agents can reliably call, lowering the barrier to onboarding new capabilities into the shared agentic layer.

Skills

Required

  • LLM tool-use
  • function calling
  • designing tool schemas
  • shipping agentic integrations
  • building evaluation frameworks
  • Java

Nice to have

  • Python
  • TypeScript
  • MCP
  • LangChain
  • LangGraph
  • agent frameworks
  • prompt engineering for tool definitions
  • prompt engineering for tool calling schemas

What the JD emphasized

  • hands-on production experience with LLM tool-use and function calling
  • built evaluation frameworks that measure AI feature quality systematically
  • Java proficiency is essential
  • experience at the boundary of ML and platform engineering
  • make AI integrations production-grade, safe, and scalable

Other signals

  • agentic AI
  • tool calling
  • evaluation infrastructure
  • interoperability layer
  • production-grade