Staff Machine Learning Engineer

Zendesk · Enterprise · Ireland +2 · Remote

The Staff Machine Learning Engineer owns the ML surface of the routing and presence products as they transition from a rules-based engine to an agentic routing engine. The role covers end-to-end ML ownership, from feature engineering and model design to production serving and monitoring, and sits within a product engineering team, with a focus on applied ML that delivers measurable customer outcomes at scale. Responsibilities include designing experimentation frameworks, shaping the integration of classical ML and LLM components, and mentoring other engineers.

What you'd actually do

  1. Own the models and algorithms at the heart of Routing & Presence. You are accountable for their quality, reliability, and interoperability with the rest of the business — from Predictive Routing today through the agentic routing engine we’re building next.
  2. Plan and scope ML work with observability and iterability designed in from the start, and with a clear view of how the resulting algorithms overlap and interoperate with adjacent systems.
  3. Propose and evaluate alternatives. For any meaningful design decision, surface the candidate approaches, the tradeoffs, and the reasoning — don’t jump to the first plausible solution.
  4. Design experimentation frameworks (offline evals and online A/B) tailored to routing outcomes, with statistical rigour and a clear tie-back to customer-facing metrics.
  5. Lead innovation sessions with Product ahead of the product development cycle — help shape _what_ we build, not just _how_.

Skills

Required

  • Classical ML (probabilistic models, feature engineering, optimisation, model selection)
  • Generative AI and LLMs (fine-tuning, RAG, prompt engineering, agentic patterns)
  • Design and run experiments end to end (power analysis, guardrail metrics, offline evals vs online impact)
  • Ship ML in production (latency, throughput, drift detection, retraining, versioning, rollback, operational realities)
  • Estimate ML project effort
  • Communicate designs and decisions clearly
  • Motivate research outcomes in terms of wider product strategy and long-term business goals
  • Collaborate deeply with backend engineers (Java, Scala)
  • Explain model behaviour, uncertainty, and limitations honestly to non-technical stakeholders
  • Translate product problems into ML problems (and back again)

Nice to have

  • Python
  • Scala
  • Java

What the JD emphasized

  • agentic routing engine
  • embedded ML expert
  • end to end
  • measurable customer outcomes
  • reliably, at scale
  • technical direction
  • offline evals and online A/B
  • customer-facing metrics
  • classical ML
  • LLM-driven components
  • explainability and guardrails
  • full lifecycle
  • multiple teams and systems
  • 18+ months
  • subject-matter expert
  • generative AI and LLMs
  • fine-tuning, RAG, prompt engineering, agentic patterns
  • Design and run experiments end to end
  • offline evals and online impact
  • Ship ML in production
  • backend engineers
  • Java and Scala
  • Python
  • model behaviour, uncertainty, and limitations
  • non-technical stakeholders
  • product problems into ML problems

Other signals

  • agentic routing engine
  • embedded ML expert
  • end-to-end ML ownership
  • technical direction for ML