Machine Learning Engineering Manager - Evaluations

Canva · Enterprise · Vienna, Austria · Information Technology

Canva is seeking a Machine Learning Engineering Manager to lead a team of Research Scientists and ML Engineers focused on building production-ready evaluation systems for AI-powered design features. The role involves coaching the team, setting technical strategy, and ensuring the deployment of robust, scalable ML systems, with a strong emphasis on visual models and evaluating aesthetic judgment.

What you'd actually do

  1. Coaching and mentoring a high-performing team of Machine Learning Engineers and Research Scientists.
  2. Owning the evaluation infrastructure: designing, building, and maintaining robust evaluation systems, quality metrics, safety monitoring, red-teaming, and competitive benchmarking to guarantee enterprise readiness and user delight at scale.
  3. Building automated metrics that reliably predict human aesthetic judgment across dimensions like visual hierarchy, layout coherence, typography, and brand alignment.
  4. Advising on human evaluation pipelines and closing the loop between user signals and model improvements.
  5. Setting technical strategy in alignment with Canva's AI and product goals.
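A core thread in the responsibilities above is checking whether an automated metric actually tracks human aesthetic judgment. A minimal, purely illustrative sketch of that validation step is a rank correlation between metric scores and mean human ratings; all function names and data below are assumptions for illustration, not Canva's actual metrics or pipeline:

```python
# Hypothetical sketch: validating an automated aesthetic metric against
# human ratings via Spearman rank correlation. All data is illustrative.

def rankdata(values):
    """Assign 1-based average ranks to values (ties share the mean rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the tied positions, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(metric_scores, human_ratings):
    """Spearman rank correlation between a metric and human judgments."""
    rx, ry = rankdata(metric_scores), rankdata(human_ratings)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Toy data: an automated layout-coherence score vs. mean human rating (1-5).
# Here the toy data is perfectly monotone, so the correlation is ~1.0.
metric = [0.91, 0.42, 0.77, 0.30, 0.65]
human = [4.6, 2.1, 3.9, 1.8, 3.2]
print(round(spearman(metric, human), 3))
```

In practice a pipeline like this would run over held-out human ratings; a high rank correlation is what "reliably predict human aesthetic judgment" cashes out to operationally.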

Skills

Required

  • leading machine learning engineering teams
  • coaching teams and delivering production systems
  • deploying and scaling generative models (diffusion models, GANs, VAEs, LLMs) in production environments
  • visual models (image, video, design)
  • building ML infrastructure, evaluation pipelines, and monitoring systems at scale
  • creating data-driven evaluation methodologies
  • systems design skills
  • MLOps
  • model serving
  • production reliability
  • communicating clearly with technical and non-technical audiences

Nice to have

  • experience with visual quality assessment, aesthetic modelling, or human preference learning
  • experience tackling the gap between automated metrics and human raters
  • understanding of design principles (hierarchy, balance, typography, colour theory) deep enough to operationalise them as measurable signals
  • staying current with both SOTA research trends and engineering best practices
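To make "operationalise design principles as measurable signals" concrete, here is a minimal, assumption-laden sketch of one such signal: horizontal balance computed from element bounding boxes. The element tuple format and the area-times-weight notion of visual mass are illustrative choices, not an established Canva metric:

```python
# Hypothetical sketch: the design principle of "balance" as a measurable
# signal. Each element is an (x, y, width, height, visual_weight) box on
# a canvas; the score compares visual mass left vs. right of centre.

def balance_score(elements, canvas_width):
    """Return 1.0 for perfect horizontal balance, falling toward 0.0
    as one side of the canvas dominates."""
    centre = canvas_width / 2
    left_mass = right_mass = 0.0
    for x, y, w, h, weight in elements:
        mass = w * h * weight  # crude visual mass: area scaled by weight
        cx = x + w / 2  # element's horizontal centre
        if cx < centre:
            left_mass += mass
        else:
            right_mass += mass
    total = left_mass + right_mass
    if total == 0:
        return 1.0  # empty canvas: trivially balanced
    return 1.0 - abs(left_mass - right_mass) / total

# Two equal boxes mirrored about the centre score 1.0; a single box on
# one side scores 0.0.
print(balance_score([(0, 0, 100, 100, 1.0), (300, 0, 100, 100, 1.0)], 400))
print(balance_score([(0, 0, 100, 100, 1.0)], 400))
```

Signals like hierarchy or colour harmony would follow the same pattern: pick a geometric or perceptual proxy, score it per design, then validate the scores against human raters.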

What the JD emphasized

  • production-ready evaluation systems
  • enterprise readiness
  • human aesthetic judgment
  • visual quality assessment
  • aesthetic modelling
  • human preference learning
  • automated metrics and human raters

Other signals

  • leading ML teams
  • building production-ready evaluation systems
  • turning cutting-edge ML capabilities into delightful product experiences
  • scaling generative models in production
  • visual models