Senior Machine Learning Operations Developer, Inference, AI/ML Platform

Autodesk · Enterprise · Toronto, ON +5

Autodesk is looking for a Senior MLOps Developer to join their AI/ML Platform team. The role focuses on operationalizing machine learning models, optimizing MLOps practices, designing automated deployment pipelines, and maintaining scalable infrastructure for model training and inference. The position requires strong experience in DevOps, MLOps, containerization (Docker, Kubernetes), CI/CD, and scripting (Python, Bash).

What you'd actually do

  1. Drive the operational excellence of our AI/ML Platform by implementing and optimizing MLOps practices
  2. Design and implement automated deployment pipelines for machine learning models, ensuring seamless transitions from development to production
  3. Collaborate with cross-functional teams to design, implement, and maintain scalable infrastructure for model training, inference, and data processing
  4. Develop and maintain robust monitoring and logging systems to track model performance, system health, and overall platform efficiency
  5. Work closely with data developers to ensure efficient data pipelines for model training and validation

Skills

Required

  • BS or MS in Computer Science or a related field
  • 5+ years of hands-on experience in DevOps and MLOps
  • Experience deploying and managing machine learning models in production environments
  • Infrastructure as Code practices (Terraform or Ansible)
  • Containerization technologies (Docker, Kubernetes)
  • Continuous Integration and Continuous Deployment (CI/CD) pipelines for machine learning projects
  • Scripting skills in Python, Bash, or similar languages
  • Monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack)
  • Security best practices in MLOps
  • Data encryption, access controls, and compliance standards
  • Collaboration and communication skills

Nice to have

  • Experience with cloud platforms (AWS or Azure)
  • Databases and data storage solutions (SQL, NoSQL, or data lakes)
  • Machine learning frameworks (TensorFlow, PyTorch)
  • Git for version control
  • Jira for project management
  • Agile development methodologies

What the JD emphasized

  • deploying and managing machine learning models in production environments
  • scalable infrastructure for model training, inference, and data processing
  • monitoring and logging systems
  • security best practices in MLOps
  • compliance standards

Other signals

  • MLOps
  • Inference
  • AI/ML Platform
  • Deployment Pipelines
  • Scalable Infrastructure