Staff Platform Engineer

Amplitude · Data AI · San Francisco, CA · Engineering : Infrastructure

Amplitude is hiring a Staff Platform Engineer to build and scale its AI-augmented cloud platform, enabling AI agents and human engineers to ship code more efficiently and safely. The role sets technical direction, drives cross-team initiatives, and owns infrastructure-as-code standards for Kubernetes, AWS, and GCP.

What you'd actually do

  1. Set technical direction. Shape platform and domain-level technical strategy that improves developer experience, reliability, security, and cost, and lead the high-complexity, cross-cutting initiatives that deliver it with measurable impact for the organization.
  2. Build the AI-augmented platform. Design org-wide tooling, guardrails, and policy-as-code that help every engineer get more out of AI-assisted development — infra primitives an LLM can safely reason about and PR against, automated review, and standards that hold as AI changes how code gets written.
  3. Own Infrastructure-as-Code standards for Kubernetes, AWS, and GCP using Terraform, Helm, Kustomize, and emerging tooling — setting the patterns other teams adopt and making the platform consumable enough that humans and agents can safely extend it.
  4. Evolve our CI/CD backbone (Argo CD / Workflows / Rollouts, GitHub Actions) into shared, well-reasoned standards that make deploys faster, safer, and easier to reason about across the engineering org.
  5. Instrument and operate. Drive observability with Datadog and Amplitude, set and raise SLOs for the team, own the dashboards, and drive improvements to the shared services and dependencies that move them.

Skills

Required

  • Kubernetes
  • AWS
  • GCP
  • Terraform
  • Helm
  • Kustomize
  • Golang
  • Python
  • GitOps
  • Argo CD
  • GitHub Actions
  • Datadog
  • CI/CD
  • cloud infrastructure

Nice to have

  • Azure

What the JD emphasized

  • AI agents are first-class users
  • AI-augmented platform
  • infra primitives an LLM can safely reason about and PR against
  • policy-as-code
  • humans and agents can safely extend it

Other signals

  • AI Agents embedded across our platform
  • rebuilding them for the AI era