Customer Engineer

Anyscale · Data AI · San Francisco, CA +1 · Customer Solutions Group

Post-sale Customer Engineer role focused on customer success, onboarding, adoption, and issue resolution for Anyscale's Ray-based platform, which supports scalable machine learning. Requires a technical background in ML engineering or distributed ML infrastructure, with a focus on LLM training, fine-tuning, inference, and serving.

What you'd actually do

  1. Resolve customer issues and help customers successfully adopt the Anyscale platform
  2. Be a technical advisor and internal champion for our key customers
  3. Own customer issues end-to-end, from troubleshooting and triage through escalation to eventual resolution
  4. Participate in our follow-the-sun customer support model to ensure continuity in resolving high-priority tickets
  5. Keep track of open customer bugs and feature requests to influence prioritization and provide timely customer updates upon resolution

Skills

Required

  • 3+ years of experience in a customer-facing technical role
  • Strong organizational skills
  • Ability to manage multiple customer needs simultaneously
  • Proficient as a machine learning engineer developing data pipelines for training, fine-tuning and inference/serving of LLMs
  • Experience running and optimizing infrastructure for distributed ML workloads on the major cloud platforms (AWS/EKS, GCP/GKE or Azure/AKS)
  • Excellent communication and interpersonal skills
  • Strong sense of ownership
  • Self-motivation
  • Eagerness to acquire new skills and do new things
  • Willingness to uplevel the knowledge and skills of your peers through mentorship, training, and shadowing

Nice to have

  • Experience with Ray
  • Knowledge of MLOps platforms
  • Knowledge of container orchestration platforms (e.g., Kubernetes)
  • Infrastructure as code (e.g., Terraform)
  • CI/CD tools (e.g., GitHub Actions)

What the JD emphasized

  • resolve customer issues
  • troubleshooting
  • customer tickets
  • ML/AI, LLM, vLLM
  • machine learning engineer developing data pipelines for training, fine-tuning and inference/serving of LLMs
  • running and optimizing infrastructure for distributed ML workloads