Senior Principal Machine Learning Engineer, ML Platform and Systems Architecture

Autodesk · Enterprise · Boston, MA +18 · Remote

Autodesk is hiring a Senior Principal ML Engineer to define and drive the technical strategy for its large-scale machine learning platforms and systems. The role involves shaping multi-year architecture, influencing engineering standards, and leading major platform initiatives across training infrastructure, data platforms, evaluation systems, model serving, and operational excellence for production ML.

What you'd actually do

  1. Define and lead technical strategy for a domain or large-scale platform supporting machine learning systems
  2. Drive architecture decisions across teams for scalable training, data, evaluation, deployment, observability, and reliability systems
  3. Lead multi-team initiatives with far-reaching technical impact across a function, platform, or division
  4. Define technical direction for data pipelines that support large-scale structured and semi-structured technical datasets
  5. Set standards for data lineage, provenance, governance, and responsible data usage in ML systems

Skills

Required

  • ML platform architecture
  • distributed systems
  • software architecture
  • platform engineering
  • ML infrastructure
  • cloud-native architectures
  • production engineering practices
  • large-scale system design
  • technical strategy
  • cross-team technical direction
  • communication

Nice to have

  • data lineage
  • provenance
  • governance
  • responsible data usage
  • Ray
  • Airflow
  • Spark
  • model deployment
  • inference services
  • monitoring
  • observability
  • geometry data
  • graph data
  • hierarchical data
  • multimodal data
  • foundation model infrastructure
  • high-throughput data systems
  • resiliency
  • service reviews
  • fire drills
  • risk reduction
  • AEC
  • design technology
  • BIM/CAD ecosystems
  • Autodesk products

What the JD emphasized

  • large-scale machine learning platforms and systems
  • multi-year architecture
  • engineering standards
  • platform initiatives
  • training infrastructure
  • data platforms
  • evaluation and experimentation systems
  • model serving frameworks
  • operational excellence for production ML
  • large-scale platform outcomes
  • large-scale system design
  • foundation model infrastructure

Other signals

  • ML platform and systems architecture
  • training infrastructure
  • data platforms
  • evaluation and experimentation systems
  • model serving frameworks
  • operational excellence for production ML