Senior Data Center Operations Technician

xAI · AI Frontier · Memphis, TN · Data Center

This role is for a Senior Data Center Operations Technician at xAI, responsible for the health of server and network infrastructure. Key metrics include mean time to detect (MTTD) and mean time to repair (MTTR). Responsibilities include troubleshooting, monitoring, rack and stack of equipment, inventory management, cabling, power supply work, hardware installation, ticket management, documentation, and hardware decommissioning. The role requires 5+ years of experience with server, storage, compute, and network hardware, as well as troubleshooting and inventory management. Basic qualifications include a high school diploma and the ability to meet the physical requirements of data center work. Preferred qualifications include strong Linux skills, on-call experience, and project leadership.

What you'd actually do

  1. Troubleshoot and monitor the servers and network in our data centers and global points of presence.
  2. Install racks, servers, and switches; this includes staging racks in place, cabling, powering up, and handing off hardware to the provisioning team for customer capacity allocation.
  3. Manage, respond to, and resolve data center operations tickets, which are used cross-functionally within xAI via Jira.
  4. Define, design, and implement network layouts and solutions within our data centers.
  5. Create and maintain documentation of tasks and standard operating procedures.

Skills

Required

  • server hardware
  • storage hardware
  • compute hardware
  • network hardware
  • troubleshooting servers
  • troubleshooting networking infrastructure
  • inventory management
  • ordering server and network equipment
  • receiving server and network equipment
  • shipping server and network equipment
  • Jira

Nice to have

  • Linux
  • Bash scripting
  • on-call experience
  • data center infrastructure projects
  • structured cabling (copper/fiber)
  • power and cooling concepts