Senior / Staff Machine Learning Engineer, Applied AI

Lila Sciences · AI Frontier · Alewife, Cambridge, MA +1 · AI

Senior/Staff Machine Learning Engineer at Lila Sciences focused on improving AI models for customer-specific scientific needs by training, evaluating, and deploying LLMs and multi-modal models. The role bridges research and engineering, translating frontier capabilities into reliable, production-quality systems and workflows.

What you'd actually do

  1. Close the last-mile gap between Lila AI model capabilities and customer-specific scientific workflows.
  2. Build evaluation loops that measure model quality, reliability, and customer fit.
  3. Design experiments to improve model performance across applied customer use cases.
  4. Feed customer learnings, data signals, and evaluation results back into Lila AI model improvement cycles.
  5. Partner with AI researchers to translate model improvements into usable capabilities.

Skills

Required

  • Python
  • PyTorch
  • JAX
  • TensorFlow
  • distributed ML training frameworks (Megatron-LM, TorchTitan, DeepSpeed, Ray)
  • designing experiments
  • evaluation metrics
  • test sets for model performance
  • debugging model behavior
  • working across research and engineering teams

Nice to have

  • adapting models for customer-facing or production workflows
  • scientific, technical, or data-intensive customer use cases
  • building evaluation harnesses
  • model monitoring
  • quality dashboards
  • retrieval-augmented generation
  • tool use
  • agentic workflows
  • RL post-training (RLHF, GRPO, tool-augmented RL)
  • training MoE architectures
  • working with product or customer-facing teams

What the JD emphasized

  • customer-specific scientific needs
  • turning frontier model capabilities into reliable workflows
  • evaluated, iterated, and used in real customer contexts
  • bridge research and engineering
  • model behavior works well end to end inside the application
  • model failures using traces, evaluations, customer context, and scientific feedback
  • reusable tooling for model adaptation, evaluation, and deployment workflows
  • Strong experience building, training, adapting, or evaluating machine learning models.
  • Ability to debug model behavior using data, traces, logs, and qualitative feedback.
  • Experience working across research and engineering teams to move ML capabilities into usable systems.
