Senior Machine Learning Engineer, MLOps West Coast

Autodesk · Enterprise · San Francisco, CA +4

Senior Machine Learning Engineer focused on MLOps for CAD and BIM, responsible for building and operating infrastructure to deploy, monitor, and integrate AI models into production across Autodesk products. This role ensures reliability, scalability, and operational excellence for AI-powered experiences, partnering with researchers and product teams.

What you'd actually do

  1. Automate model testing and validation. Implement and operate CI/CD pipelines to enable safe, repeatable deployments and rollbacks.
  2. Provision and manage backend resources for inference (compute, containers, scaling), and tune performance, reliability, and cost in production.
  3. Define and continuously monitor health and performance metrics for deployed services. Triage issues by severity and drive timely resolution, including incident response and runbooks.
  4. Own end-to-end REST API integration, connecting backend model services to product and platform surfaces through scalable, containerized services.
  5. Work with researchers, evaluation engineers, product managers, and partner engineering teams to deliver production-ready solutions, communicate status and risks, and escalate when needed.

Skills

Required

  • BS or MS in Computer Science or Computer Engineering, or equivalent industry experience.
  • 3+ years of professional software engineering experience building and operating production services.
  • Experience automating testing and deployments using CI/CD, including release workflows that support safe rollouts and rollbacks.
  • Experience building and operating cloud-hosted, containerized services (for example, Docker and Kubernetes or similar), including provisioning resources and scaling inference workloads.
  • Experience building REST APIs using Python-based frameworks (or similar), and integrating backend services with product or platform consumers.
  • Strong software engineering fundamentals: version control, code quality, and writing maintainable, testable software.
  • Strong written communication skills to document architectures, runbooks, and operational processes.

Nice to have

  • Experience running production ML or LLM inference services, including performance tuning, cost optimization, and capacity planning.
  • Experience with observability tooling and practices (metrics, logging, tracing, alerting) and incident response in an on-call environment.
  • Experience deploying services within an enterprise internal platform environment with standardized pipelines, security controls, and compliance requirements.
  • Familiarity with rate limiting, authentication and authorization, and API security best practices.
  • Familiarity with design, manufacturing, or AEC workflows, and how backend services integrate into CAD/BIM product experiences.
  • Familiarity with Agile or Scrum ways of working.

What the JD emphasized

  • production services
  • cloud-hosted, containerized services
  • inference workloads
  • production ML or LLM inference services

Other signals

  • MLOps
  • production models
  • inference services
  • CI/CD
  • monitoring