Staff / Senior Software Engineer, Infrastructure

Suno · Multimodal · Boston, MA · Engineering

This is a Staff/Senior Software Engineer, Infrastructure role at Suno, a consumer entertainment company focused on AI music. The role involves building and operating the foundational systems, platforms, and tooling that support the entire company, including cloud, compute, developer platforms, and AI-powered tools. Key areas include scaling Kubernetes, building distributed databases, designing inference infrastructure, creating developer experience platforms, and building shared AI infrastructure and agentic workflows. The role requires strong experience with distributed systems and cloud services, and with building and operating systems at scale, with a focus on engineering excellence and ownership.

What you'd actually do

  1. Architect and build services to handle massive consumer traffic, data, and usage
  2. Design systems that are performant, secure, scalable, and easy to observe
  3. Own systems end-to-end — from design and implementation through deployment, monitoring, and operational excellence
  4. Lead by example on engineering excellence — code quality, system design, documentation, and operational maturity
  5. Collaborate with engineering teams across the company to understand their needs and build the right abstractions

Skills

Required

  • 5–7+ years of infrastructure, backend, or systems engineering experience
  • Experience building and operating systems at significant scale in production
  • Strong understanding of distributed systems, cloud services (AWS/GCP), and modern infrastructure patterns
  • Experience with some combination of: Kubernetes, Docker, infrastructure as code (Pulumi/Terraform/CDK), databases (Postgres, distributed relational databases), caching systems, or container orchestration
  • Ability to reason through hard scaling, reliability, and performance problems with clear technical judgment
  • High ownership — you drive projects end-to-end without waiting for direction
  • Strong communication skills — you keep stakeholders informed and reduce ambiguity for the teams you serve
  • An obsession with engineering excellence, iterating and learning rapidly, and working hard

Nice to have

  • Deep experience with Kubernetes at scale — cluster management, control plane scaling, multi-tenancy
  • Experience with large-scale databases, distributed data layers, or storage systems
  • Experience with ML infrastructure — inference serving, ML data pipelines, MLOps, GPU infrastructure
  • Experience on a platform or developer experience team where your primary customers were other engineers
  • Experience building internal systems 0→1 (auth, notifications, CDN, DevEx tooling, or similar)
  • Strong on-call instincts: triage, debug, and resolve incidents across a distributed stack
  • Hands-on familiarity with AI tooling and the current landscape of AI for software engineering — models, agents, coding assistants, and agentic workflows
  • Golang or Rust experience, especially for large-scale systems
  • Experience with websockets, CDNs, streaming traffic patterns, and audio/video delivery
  • Security best practices in building and scaling infrastructure
  • Technical leaders

What the JD emphasized

  • AI-powered tools and workflows
  • Designing inference infrastructure
  • Building shared AI infrastructure, agentic tooling, and workflows
  • Creating an agent-friendly knowledge graph and data connections
  • Defining best practices, frameworks, and evaluation systems for how Suno uses AI tooling
  • Experience building and operating systems at significant scale in production
  • Strong understanding of distributed systems, cloud services (AWS/GCP), and modern infrastructure patterns
  • Ability to reason through hard scaling, reliability, and performance problems with clear technical judgment
  • High ownership — you drive projects end-to-end without waiting for direction
  • An obsession with engineering excellence, iterating and learning rapidly, and working hard