Engineering Manager, Inference

Anthropic · AI Frontier · New York, NY +2 · Remote · AI Research & Engineering

Engineering Manager for Anthropic's performance and scaling teams, focusing on improving model performance and scaling inference and training systems. Responsibilities include front-line leadership, managing day-to-day execution, prioritizing work, and coaching reports. Requires management experience in technical environments, background in ML/AI, and interest in safe AI development.

What you'd actually do

  1. Provide front-line leadership of engineering efforts to improve model performance and scale our inference and training systems
  2. Become familiar enough with the team’s technical stack to make targeted contributions as an individual contributor
  3. Manage day-to-day execution of the team's work
  4. Prioritize the team’s work and manage projects in a highly dynamic, fast-paced environment
  5. Coach and support your reports in understanding and pursuing their professional growth

Skills

Required

  • 1+ years of management experience in a technical environment
  • background in machine learning, AI, or a similar related technical field
  • building strong relationships with stakeholders
  • quick learner, capable of understanding and contributing to discussions on complex technical topics
  • experience managing teams through periods of rapid growth and change

Nice to have

  • performance or distributed systems
  • deeply interested in the potential transformative effects of advanced AI systems and committed to ensuring their safe development
  • High-performance, large-scale ML systems
  • GPU/Accelerator programming
  • ML framework internals
  • OS internals
  • Language modeling with transformers

What the JD emphasized

  • performance
  • scale
  • inference
  • training systems
  • bottlenecks
  • efficiency
  • distributed systems
  • machine learning
  • AI
  • advanced AI systems
  • safe development
  • complex technical topics
  • rapid growth and change
  • complex technical systems

Other signals

  • performance and scaling teams
  • making the most efficient and impactful use of our compute resources
  • inference and training systems
  • identifying and removing bottlenecks
  • building robust and durable solutions
  • maximizing the efficiency of our systems