Manager, AI/ML Ops Engineering (Hybrid in Bangalore)

Smartsheet · Seattle · India · Business Intelligence & Ops

The Manager, AI/ML Ops Engineering leads and mentors a team; defines and governs AI/ML operations solutions for scalability, cost-efficiency, and reliability; and develops standardized AI/MLOps workflows, including CI/CD/CT pipelines. The role requires experience in enterprise SaaS and in building and maintaining AI/ML Ops platform systems, plus knowledge of AI/ML frameworks and cloud platforms.

What you'd actually do

  1. Focus on team members, coaching them to play to their strengths, grow, and deliver peak performance
  2. Lead and mentor a team of AI/ML Ops engineers and operations specialists
  3. Define and govern solutions for AI/ML operations, ensuring scalability, cost-efficiency, and reliability
  4. Develop and maintain standardized AI/MLOps workflows, including CI/CD/CT (Continuous Integration/Continuous Delivery/Continuous Training) pipelines
  5. Delegate and harness the aggregate strength of your team

Skills

Required

  • Enterprise SaaS software solutions with high availability and scalability
  • Experience building teams through recruiting and retention
  • Experience leading and mentoring a team of ML engineers and operations specialists
  • Experience in building and maintaining AI/ML Ops platform systems ensuring scalability, reliability, efficiency and security
  • AI/MLOps workflows on Databricks, MLflow, Mosaic AI Agent Framework, Unity Catalog, Vector Search, Knowledge Graph
  • Knowledge of AI/ML frameworks like LangChain, LangGraph for AI/ML Ops pipeline integration
  • Cloud platforms: hands-on experience with at least one major cloud provider (AWS, Azure, or GCP); experience with an AWS-hosted data platform is preferred
  • Programming languages like Python and SQL
  • Modern software engineering practices like Kubernetes, CI/CD, IaC tools (preferably Terraform), observability, monitoring, and alerting
  • Solution cost optimisation and design-to-cost

Nice to have

  • Experience with an AWS-hosted data platform

What the JD emphasized

  • AI/ML Ops platform systems
  • CI/CD/CT
  • regulatory compliance, security, and ethical guidelines

Other signals

  • AI/ML Ops platform systems
  • CI/CD/CT pipelines
  • scalability, cost-efficiency, and reliability
  • regulatory compliance, security, and ethical guidelines