Principal Machine Learning Engineer, Evaluation

Autodesk · Enterprise · Toronto, ON +2

Principal Machine Learning Engineer focused on evaluating ML models for CAD and BIM products. The role involves designing evaluation datasets, metrics, and protocols, building repeatable evaluation tooling, analyzing data to assess model behavior and customer outcomes, validating ML-powered product experiences, and communicating findings to stakeholders. The goal is to ensure ML-powered experiences meet high standards for quality, reliability, and customer impact.

What you'd actually do

  1. Design and run evaluation protocols (datasets, metrics, statistical analysis) that reflect real CAD/BIM user workflows and production conditions
  2. Build repeatable evaluation tooling and automation to support model development, regression testing, and release readiness decisions
  3. Curate, process, and analyze data from multiple sources (including production-derived samples) to assess model behavior and customer outcomes
  4. Validate end-to-end ML-powered product experiences and translate product requirements into measurable evaluation criteria
  5. Communicate findings and recommendations clearly to researchers, engineers, product teams, and leadership

Skills

Required

  • BS or MS in Mechanical Engineering, Architecture, Computer Engineering, Computer Science, Applied Math, Statistics, or equivalent industry experience
  • 4+ years of professional experience in ML model evaluation, ML-enabled QA, or applied ML engineering, including designing evaluation datasets, metrics, and protocols
  • Strong software engineering skills for building repeatable evaluation systems, including Python, data pipelines and analysis, testable and maintainable code, version control, and cloud-based workflows (for example, AWS or Azure)
  • Strong written communication skills for documenting evaluation methods, results, and recommendations

Nice to have

  • Familiarity with design, manufacturing, or AEC workflows, including hands-on experience with CAD/BIM tools such as Fusion, AutoCAD, or Revit
  • Experience with geometry or design data representations, including 2D and 3D
  • Familiarity with ML frameworks and tooling (for example, PyTorch or Ray)

What the JD emphasized

  • ML model evaluation
  • evaluation datasets, metrics, and protocols
  • quality, reliability, and customer impact
  • real user workflows
  • production conditions
  • release readiness
  • customer outcomes
  • model quality issues
  • accuracy, robustness, latency, failure modes

Other signals

  • ML model evaluation
  • quality, reliability, and customer impact
  • design evaluation datasets, metrics, and protocols
  • guide release readiness
  • continuous improvement in production
  • communicate results
  • partnering with researchers, engineers, and product teams