Principal Software Engineer, Enterprise Scalability

Klaviyo · Enterprise · Boston, MA · Engineering

Principal Software Engineer focused on enterprise scalability for a multi-tenant SaaS platform. The role involves defining scalability fitness functions, designing sharding and partitioning strategies, and applying AI to proactive anomaly detection, synthetic load generation, and guided runbooks to improve performance and resiliency. The engineer will lead architectural changes, hunt bottlenecks, and partner with teams to implement improvements, with a strong emphasis on cross-organizational influence and on tying technical work to business impact.

What you'd actually do

  1. Define enterprise scalability fitness functions (latency/throughput/error rates) and a scorecard; align teams to SLOs and budgets.
  2. Design/implement sharding and partitioning strategies, caching/back‑pressure, multi‑region readiness, and high‑volume migration paths.
  3. Build lightweight enablement: benchmarks, profiling harnesses, reproducible testbeds; pair with teams to land fixes.
  4. Lead scalability reviews and readiness gates that accelerate—not block—delivery; drive incident deep dives tied to systemic fixes.
  5. Communicate clearly to execs and engineers, tying technical work to business impact and customer outcomes.
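
To make item 1 concrete, here is a minimal sketch of what a scalability fitness function and scorecard could look like. It is an illustration only, not Klaviyo's implementation: the `FitnessFunction` class, the `scorecard` helper, the metric names, and every threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitnessFunction:
    """One scalability target. Names and thresholds are hypothetical."""
    name: str
    threshold: float
    higher_is_better: bool = False  # latency and error rate: lower is better

    def passes(self, observed: float) -> bool:
        # Compare the observed value against the target in the right direction.
        if self.higher_is_better:
            return observed >= self.threshold
        return observed <= self.threshold


def scorecard(functions, observed):
    """Map each fitness function name to pass/fail. A missing metric fails."""
    results = {}
    for f in functions:
        value = observed.get(f.name)
        results[f.name] = value is not None and f.passes(value)
    return results


# Hypothetical targets covering latency, throughput, and error rate.
FUNCTIONS = [
    FitnessFunction("p99_latency_ms", 250.0),
    FitnessFunction("throughput_rps", 5000.0, higher_is_better=True),
    FitnessFunction("error_rate", 0.001),
]

# Hypothetical measurements from a load-test run.
report = scorecard(
    FUNCTIONS,
    {"p99_latency_ms": 180.0, "throughput_rps": 4200.0, "error_rate": 0.0004},
)
print(report)
```

In this made-up run the latency and error-rate targets pass while throughput falls short, which is the kind of signal a readiness gate might surface before a launch.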

Skills

Required

  • 12+ years scaling multi‑tenant SaaS
  • Performance engineering
  • Capacity planning
  • Sharding/partitioning
  • Caching/back‑pressure
  • Multi-region readiness
  • High-volume migrations
  • Cross-org influence
  • Technical communication

Nice to have

  • Company-wide adoption of fitness functions
  • Documented, reproducible testbeds
  • AI-driven anomaly detection
  • Generative load testing
  • Copilot runbooks
  • Reducing time to isolate regressions

What the JD emphasized

  • enterprise scalability
  • multi-tenant SaaS
  • performance engineering
  • capacity planning
  • sharding/partitioning
  • caching/back-pressure
  • multi-region readiness
  • high-volume migrations
  • AI tools & automation
  • explicit guardrails and observability
  • cross-org influence
  • fitness functions, scorecards, and readiness gates
  • AI fluency

Other signals

  • enterprise scalability
  • multi-tenant SaaS
  • performance engineering
  • capacity planning
  • AI for scale