Senior Machine Learning Engineer - Multimodal Data

Canva · Enterprise · Vienna, Vienna, Austria · Information Technology

Canva is seeking a Senior Machine Learning Engineer to work on the data foundations for its multimodal agent research. The role involves building and maintaining data pipelines, datasets, and tooling for training and evaluation loops across text, image, and multimodal sources. The engineer will collaborate with researchers to translate requirements into data specifications, create evaluation datasets, and develop tooling for dataset construction, including human annotation and synthetic data generation. The role also emphasizes data quality, documentation, and ML DevOps practices.

What you'd actually do

  1. Design and build data pipelines for agent training: collection, filtering, deduplication, formatting, and versioning across text, image, and multimodal sources.
  2. Build and maintain infrastructure for efficient data loading, storage, and retrieval at scale (S3, distributed systems, streaming pipelines).
  3. Collaborate with research scientists to translate research requirements into concrete data specifications, and iterate as experiments reveal new needs.
  4. Create evaluation datasets and benchmarks in collaboration with researchers—curating task distributions that surface real failure modes.
  5. Develop tooling for dataset construction—including human annotation workflows, synthetic data generation, and preference data collection for RLHF/DPO-style training.

Skills

Required

  • Strong software engineering skills in Python
  • Experience building production-grade data pipelines
  • ML DevOps
  • Prompt engineering
  • ML data workflows
  • Large-scale data processing and loading (Ray or similar)
  • Data versioning
  • Format considerations for training (tokenization, batching, sharding)
  • Data pipelines for large-scale distributed ML training runs
  • Annotation tooling and human-in-the-loop data collection (Label Studio or internal systems)
  • Understanding of ML training requirements
  • Experience loading and writing large datasets to/from cloud infrastructure (AWS) and distributed storage systems
  • Strong communication skills
  • Collaborative approach

Nice to have

  • Experience with preference data collection for RLHF or reward modelling
  • Familiarity with multimodal data (image-text pairs, video, design assets)
  • Experience building synthetic data generation pipelines using LLMs
  • Background in data quality metrics and monitoring systems
  • Contributions to dataset releases or benchmarks in the ML community

What the JD emphasized

  • multimodal agent research
  • data foundations
  • training pipelines
  • datasets
  • tooling
  • agent training
  • research requirements
  • evaluation datasets
  • dataset construction
  • data quality
  • ML training requirements
  • large-scale distributed ML training runs
