Senior Staff Software Engineer, Storage

Crusoe · Data AI · San Francisco, CA - US · Cloud Engineering

Senior Staff Software Engineer to architect and drive the long-term technical strategy for Crusoe's storage engine, bridging high-performance hardware and globally distributed object stores. The role focuses on systems programming, storage protocols, open-source contributions, and deep performance engineering for AI-scale infrastructure.

What you'd actually do

  1. Define and drive the long-term technical strategy for Crusoe’s storage engine.
  2. Build the foundations of our V2 storage re-architecture, drawing on proven systems programming experience in languages such as C, C++, Go, and/or Rust.
  3. Architect and implement solutions using industry-standard storage protocols such as NFS, SMB, iSCSI, and NVMe/TCP.
  4. Maintain an active track record of contributions to the open-source community (e.g., Ceph, GlusterFS, Lustre, Spectrum Scale, OpenEBS).
  5. Serve as the final arbiter for critical architecture decisions across the Foundations organization.

Skills

Required

  • 12+ years of experience building and operating large-scale, complex distributed cloud computing infrastructure products.
  • Strong troubleshooting and performance tuning skills; ability to profile and optimize the entire I/O path.
  • Mastery of Consistency & Durability: Deep theoretical and practical knowledge of distributed state and data protection at petabyte scale.
  • Mastery of professional software engineering practices for the full SDLC, including coding standards, build processes, and testing.
  • Ability to champion and lead initiatives across the engineering organization, such as tech talks and technical reading groups.

Nice to have

  • Expertise in one or more public cloud offerings (AWS, GCP, Azure, OCI) and familiarity with AI/ML frameworks (PyTorch, TensorFlow, JAX) and MLOps.
  • Experience with cutting-edge I/O architectures like DAOS or SPDK.
  • Background in RDMA and high-performance networking, including SmartNICs and RoCEv2.
  • Experience with highly available and scalable systems such as Cassandra, MongoDB, Redis, or Kafka.
  • Strong knowledge of distributed systems fundamentals, including the CAP theorem, Paxos/Raft, consistent hashing, and sharding strategies.
  • Advanced degree (Master's or PhD) in Computer Science, Engineering, or a related field.

What the JD emphasized

  • AI-scale infrastructure
  • long-term technical strategy
  • V2 storage re-architecture
  • final arbiter for critical architecture decisions