Senior Machine Learning Engineer - Agent Tools Interop (AU Remote)

Canva · Enterprise · Brisbane, QLD, Australia · Information Technology

A Senior Machine Learning Engineer role focused on building and scaling agentic AI systems, with an emphasis on reliable, safe, and scalable tool calling across a large ecosystem of agents. The role involves designing tool schemas, building evaluation pipelines, and collaborating with GenAI and platform teams to create a foundational interoperability layer for AI integrations.

What you'd actually do

  1. Build and evolve the systems that enable agents to discover, invoke, and safely execute capabilities across Canva at scale, from initial foundations through to long-term platform maturity.
  2. Design tool schemas and definition patterns that maximize LLM tool selection accuracy and reliable invocation across diverse agent consumers and AI integrations.
  3. Build and operate evaluation pipelines that measure tool calling behavior in production, catch regressions, and drive continuous quality improvement.
  4. Collaborate with product, platform, and GenAI teams to integrate agentic capabilities into production systems and understand how tool use behaves at real-world scale.
  5. Advise contributing teams on how to define tools that agents can reliably call, lowering the barrier to onboarding new capabilities into the shared agentic layer.

Skills

Required

  • LLM tool-use
  • function calling
  • designing tool schemas
  • shipping agentic integrations
  • building evaluation frameworks
  • Java

Nice to have

  • Python
  • TypeScript
  • MCP
  • LangChain
  • LangGraph
  • prompt engineering for tool definitions
  • prompt engineering for tool calling schemas

What the JD emphasized

  • hands-on production experience with LLM tool-use and function calling
  • built evaluation frameworks that measure AI feature quality systematically
  • Java proficiency is essential
  • experience at the boundary of ML and platform engineering
  • not a fit: research-only backgrounds, traditional ML without LLM/GenAI exposure, or data engineering experience presented as ML engineering

Other signals

  • agentic AI
  • tool calling
  • evaluation infrastructure
  • interoperability layer
  • production-grade