Engineering Manager, Runtime Fabric

Baseten · Data AI · San Francisco, CA · EPD

Engineering Manager for the Runtime Fabric team, which purpose-builds the container runtime and storage layers for AI inference workloads. The role involves leading a team of systems engineers, setting technical direction, and contributing to open-source container projects. It focuses on optimizing container runtimes for AI inference, addressing issues such as GPU memory, image pull times, and multi-tenant isolation.

What you'd actually do

  1. Recruit, hire, and develop a high-performing team of systems engineers with deep container and Linux expertise.
  2. Foster a culture of technical rigor, open-source contribution, and continuous improvement.
  3. Provide regular coaching, feedback, and career development support to your direct reports.
  4. Partner with engineering leadership to define the long-term vision and roadmap for container runtime and storage infrastructure.
  5. Guide the team in extending and hardening containerd, runc, and related OCI ecosystem projects to meet the GPU-specific requirements of production AI inference, including startup performance, GPU device access, and multi-tenant isolation.

Skills

Required

  • Proven experience managing and growing engineering teams in a systems, infrastructure, or low-level runtime context.
  • Deep familiarity with the Linux container ecosystem: containerd, runc, OCI Runtime Spec, Linux namespaces, and cgroups, with the ability to engage credibly in code reviews and architectural discussions.
  • Contributions to containerd/containerd, opencontainers/runc, google/gvisor, kata-containers/kata-containers, or closely related open-source projects.
  • Strong systems programming background in Go and/or C/C++.
  • Experience with distributed storage systems, content-addressable storage, or large-scale caching infrastructure.
  • Understanding of how container images are structured, stored, and delivered at scale.
  • Strong written and verbal communication skills, with the ability to influence without authority across teams.

Nice to have

  • Experience with GPU device access in containers: NVIDIA Container Toolkit, CDI (Container Device Interface), or GPU-aware scheduling.
  • Familiarity with lazy-loading snapshotters (stargz, soci, EROFS/Nydus) or peer-to-peer image distribution.
  • Experience with secure container runtimes (gVisor, Sysbox) or micro-VM technologies (Firecracker, Cloud Hypervisor).
  • Understanding of containerd's shim API (v2) and experience building custom shim implementations.
  • Background in multi-tenant infrastructure or security-sensitive serving environments.

What the JD emphasized

  • deep container and Linux expertise
  • containerd, runc, and related OCI ecosystem projects
  • GPU-specific requirements of production AI inference
  • startup performance
  • GPU device access
  • multi-tenant isolation
  • Baseten Delivery Network
  • cold starts
  • burst scaling events
  • model weights
  • container images
  • training checkpoints
  • deployment artifacts
  • GPU-aware isolation mechanisms
  • secure container runtimes
  • Linux namespace hardening
  • micro-VM integration
  • end-to-end ownership
  • container startup performance path
  • snapshotter initialization
  • weight delivery
  • first inference request
  • open-source containerd ecosystem
  • core maintainers