Senior Machine Learning Engineer

Rubrik · Enterprise · Palo Alto, CA · Engineering

Rubrik is seeking a Senior Machine Learning Engineer to join its SAGE team and build and productionize "LLM-as-judge" systems for real-time AI agent governance and remediation. The role covers the end-to-end model lifecycle: curating data, training small models, serving them at production latency, and closing the feedback loop with customer signals. The team emphasizes shipping models to enterprise customers quickly and proving the efficacy of small, specialized models over larger frontier models for AI safety and governance.

What you'd actually do

  1. Owning the full training lifecycle for the SLMs and classifiers in SAGE's real-time enforcement path, including base-model selection, supervised fine-tuning, preference optimization (DPO/RLAIF), and distillation from frontier teacher models.
  2. Designing multi-stage inference pipelines that handle both real-time enforcement (inline prompt, response, and tool-call blocking) and high-throughput batch workloads (offline scoring, back-testing, corpus mining) while processing billions of tokens daily across Global 2000 customer agent fleets.
  3. Designing automated data curation pipelines that mine live customer environments (with privacy and tenancy guarantees) for high-value, per-tenant training examples, such as long-tail violations, near-miss policy edges, and novel agent behaviors, and that route those examples back into each customer's training loop.
  4. Building memory and context harnesses that fuse data sensitivity, identity, and historical agent behavior into real-time enforcement decisions to ensure SAGE reasons from each customer's specific context.
  5. Cross-Functional Collaboration and Translating Customer Reality into

Skills

Required

  • Machine Learning
  • Model Training
  • Fine-tuning
  • Distillation
  • Preference Optimization (DPO/RLAIF)
  • Model Serving
  • Inference Optimization
  • Quantization
  • Synthetic Data Generation
  • Evaluation Frameworks
  • Data Curation
  • Anomaly Detection
  • Adversarial Training
  • Red Teaming
  • Python
  • MLOps

Nice to have

  • Small Language Models (SLMs)
  • LLM-as-judge
  • AI Governance
  • AI Safety
  • Agent Rewind
  • LoRA
  • GRPO
  • KV-cache-aware routing
  • Continuous Batching
  • Speculative Decoding
  • Model Gateway Design
  • Canary, Shadow, and A/B testing
  • Online Evaluation
  • Drift Detection
  • Calibration Monitoring
  • Policy Coverage Analysis
  • Context Harnesses
  • Natural Language Policy Refinement

What the JD emphasized

  • real-time enforcement
  • production traffic
  • production latency
  • live request path
  • production traffic patterns
  • live deployments
  • live customer traffic
  • live model decisions
  • live customer environments
  • live agent traffic
  • production decisions

Other signals

  • LLM-as-judge
  • AI governance
  • real-time enforcement
  • small language models
  • AI safety