Storage Systems Administrator II

Crusoe · Data AI · San Francisco, CA - US · Cloud Engineering

This role is for a Storage Systems Administrator II at Crusoe, an AI infrastructure company. The primary responsibility is keeping the company's all-flash storage ecosystems (VAST Data and Pure Storage) reliable and well maintained so that large-scale AI training has high-speed data access. The work covers daily administration, health monitoring, maintenance, data integrity, troubleshooting, and automating tasks with scripts.

What you'd actually do

  1. Manage the daily administration of VAST Data and Pure Storage environments, including volume provisioning, export management, and quota adjustments.
  2. Use tools like Grafana and Prometheus to monitor cluster health, tracking IOPS and latency to identify potential bottlenecks before they impact users.
  3. Assist in executing non-disruptive software upgrades (VAST OS, Purity//FB) and hardware expansions to keep our infrastructure modern and secure.
  4. Implement and verify snapshot schedules and replication policies so that data stays durable and recovery points can be restored successfully.
  5. Resolve storage-related tickets and performance issues, collaborating with senior engineers and vendor support (VAST/Pure) to minimize downtime.

Skills

Required

  • 2–6 years of experience in Storage or Systems Administration
  • Managing enterprise-grade storage arrays
  • Linux CLI
  • Mounting file systems
  • Basic network configuration
  • NFS
  • SMB
  • NVMe-oF
  • Python
  • Bash
  • API interaction
  • Automating repetitive system tasks
  • Documentation
  • Change management

Nice to have

  • VAST Data
  • Pure Storage
  • FlashBlade
  • FlashArray
  • Pure1
  • VAST VMS/Insight
  • InfiniBand
  • RoCE networking
  • Data center environment
  • High-performance computing (HPC) workloads