Software Engineer 2 (MLOps)

Toast · Enterprise · Bangalore, India · R&D: Engineering: Fintech Data Science & AI

Machine Learning Engineer II at Toast, focused on building and deploying scalable ML and agentic AI pipelines, developing APIs, and implementing microservices with an emphasis on latency, reliability, and performance. The role involves MLOps best practices, model deployment, orchestration, and monitoring, and close collaboration with cross-functional teams to deliver ML-powered products.

What you'd actually do

  1. Collaborate with machine learning engineers and data scientists to design and build scalable ML and agentic AI pipelines, develop APIs, and implement microservices with a strong focus on latency, reliability, and performance
  2. Drive engineering excellence by enforcing best practices in version control, code reviews, testing, and documentation, contributing to a high-quality and maintainable codebase
  3. Stay up to date with emerging tools, technologies, and best practices across ML engineering and cloud infrastructure, and actively contribute to continuous improvement within the team
  4. Monitor, debug, and optimize model performance and production infrastructure, ensuring robustness, scalability, and efficiency
  5. Contribute effectively in an agile development environment, embracing iterative delivery, ownership, and continuous learning

Skills

Required

  • Python
  • Java/Kotlin
  • PySpark
  • scikit-learn
  • TensorFlow
  • PyTorch
  • microservices-based architectures
  • AWS
  • MLOps best practices
  • CI/CD pipelines
  • version control (Git)
  • testing (TDD)
  • experiment tracking (MLflow)
  • workflow orchestration (e.g., Apache Airflow)
  • model deployment
  • scalable infrastructure
  • end-to-end ML lifecycle
  • software engineering principles
  • object-oriented design
  • modularity
  • maintainability
  • cross-functional teams

Nice to have

  • deploying agents into production

What the JD emphasized

  • building and deploying machine learning systems in production environments
  • scalable ML and agentic AI pipelines
  • model performance and production infrastructure
  • end-to-end ML lifecycle

Other signals

  • deploying machine learning models for product lines