Machine Learning Systems Engineer, Model APIs

Anthropic · AI Frontier · AI Research & Engineering

A Machine Learning Systems Engineer role focused on building and maintaining Model Evaluations infrastructure and Research Inference APIs and infrastructure. The goal is to let researchers evaluate models and run inference workloads effectively, directly accelerating AI research.

What you'd actually do

  1. Design, build, and maintain Model Evaluations infrastructure that enables researchers to systematically test and assess model capabilities
  2. Develop and optimize APIs and infrastructure for Research Inference to accelerate the model development lifecycle
  3. Create scalable data pipelines for collecting, processing, and analyzing research outputs
  4. Implement monitoring, logging, and performance optimization for research-focused inference systems
  5. Build intuitive interfaces and tools that allow researchers to configure, run, and analyze complex evaluation workflows
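To make the responsibilities above concrete, here is a minimal, hypothetical sketch of an evaluation harness in Python. All names (`EvalCase`, `EvalReport`, `run_eval`, the stub model) are illustrative assumptions, not Anthropic's actual API; a real system would call a remote inference endpoint and add the monitoring and logging described above.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    """A single evaluation item: a prompt and the expected answer."""
    prompt: str
    expected: str

@dataclass
class EvalReport:
    """Aggregate result of one evaluation run."""
    total: int
    passed: int

    @property
    def accuracy(self) -> float:
        return self.passed / self.total if self.total else 0.0

def run_eval(cases: List[EvalCase], infer: Callable[[str], str]) -> EvalReport:
    """Run every case through an inference function and score exact matches."""
    passed = sum(1 for c in cases if infer(c.prompt).strip() == c.expected)
    return EvalReport(total=len(cases), passed=passed)

# Usage with a trivial stub "model" (echoes the last word) standing in
# for a real inference API client:
echo_model = lambda prompt: prompt.split()[-1]
cases = [
    EvalCase("The capital of France is Paris", "Paris"),
    EvalCase("2 plus 2 equals 4", "4"),
]
report = run_eval(cases, echo_model)
print(report.accuracy)  # 1.0
```

The key design point the role description implies: the inference backend is injected (`infer`), so the same harness can target different models, checkpoints, or serving stacks without changing the evaluation logic.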

Skills

Required

  • 5+ years of significant software engineering experience
  • Python proficiency
  • Experience with cloud infrastructure (AWS, GCP)
  • Experience with data infrastructure and processing large datasets
  • Excellent communication skills
  • Ability to collaborate effectively with research teams
  • Ability to work independently and take ownership

Nice to have

  • ML experience, especially with high-performance, large-scale ML systems
  • Experience with GPUs, Kubernetes, PyTorch, or ML acceleration hardware
  • Building evaluation frameworks for machine learning models
  • Working in or adjacent to ML research teams
  • Distributed systems design and optimization
  • Real-time inference systems for large language models
  • Performance profiling and optimization
  • Infrastructure as Code and CI/CD pipelines

What the JD emphasized

  • Model Evaluations infrastructure
  • Research Inference
  • systematically testing and assessing model capabilities
  • accelerate the model development lifecycle
  • scalable data pipelines
  • monitoring, logging, and performance optimization
  • research-focused inference systems
  • complex evaluation workflows
  • high performance, large-scale ML systems
  • real-time inference systems for large language models
  • performance profiling and optimization

Other signals

  • Research Inference APIs and infrastructure
  • scalable systems for researchers