Software Engineer

Anthropic · AI Frontier · AI Research & Engineering

A Software Engineer role focused on building and scaling large ML systems and on improving the infrastructure, efficiency, and tooling behind AI research and development. The role emphasizes building safe, steerable, and trustworthy AI systems, and the work spans many parts of ML infrastructure and experimentation.

What you'd actually do

  1. Optimizing the throughput of a new attention mechanism
  2. Comparing the compute efficiency of two Transformer variants
  3. Making a Wikipedia dataset in a format models can easily consume
  4. Scaling a distributed training job to thousands of GPUs
  5. Writing a design doc for fault tolerance strategies

Skills

Required

  • Significant software engineering experience
  • Results-oriented, with a bias towards flexibility and impact
  • Pick up slack, even if it goes outside your job description
  • Enjoy pair programming
  • Want to learn more about machine learning research
  • Care about the societal impacts of your work
  • High performance, large-scale ML systems
  • GPUs, Kubernetes, PyTorch, or OS internals
  • Language modeling with transformers
  • Reinforcement learning
  • Large-scale ETL
  • Expertise in security and privacy best practices
  • Machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL
  • Low-level systems, for example Linux kernel tuning and eBPF
  • Technical expertise: Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems

Nice to have

  • Bachelor's degree in a related field or equivalent experience

What the JD emphasized

  • Significant software engineering experience
  • High performance, large-scale ML systems
  • GPUs, Kubernetes, PyTorch, or OS internals
  • Language modeling with transformers
  • Reinforcement learning
  • Large-scale ETL
  • Machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL
  • Low-level systems, for example Linux kernel tuning and eBPF
  • Technical expertise: Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems

Other signals

  • building large-scale ML systems
  • making safe, steerable, trustworthy systems
  • improving throughput and efficiency
  • running and designing scientific experiments
  • improving our dev tooling