Manager, Data Center Operations

Roblox · Consumer · Goodyear, AZ · Engineering Operations

Roblox is seeking a Manager, Data Center Operations to oversee and scale its core data center and hardware infrastructure. The role involves leading a team of engineers, managing rack deployments and hardware troubleshooting, and improving infrastructure standards to ensure high availability for millions of concurrent users. The position requires extensive experience in data center environments, server and network equipment management, and team leadership.

What you'd actually do

  1. Develop and maintain the Core Data Center and hardware infrastructure to meet the large-scale, real-time requirements of our Imagination Platform™, so that our community has an awesome experience anywhere in the world. This includes all aspects of server and network infrastructure, power, and environmental monitoring.
  2. Lead a growing team of data center engineers focusing on rack deployments, hardware troubleshooting and break-fix, and decommissioning.
  3. Identify and solve critical problems, and prevent them from recurring through root cause analysis and recommendations for improved automation. Guide, train, and educate staff on best practices for break/fix tasks on server hardware and network infrastructure.
  4. Create, influence, and improve the development platform, infrastructure, metrics, standards (Runbooks, SOPs, MOPs), and methods to achieve scalability and high availability.
  5. Participate in the on-call rotation for our critical infrastructure.
  6. Build and implement Core sites around the world, including low-voltage cabling, BOM creation, and vendor management.

Skills

Required

  • 10+ years of experience working in large-scale Data Center Infrastructure environments
  • 5+ years of experience leading a team of 3 or more data center engineers
  • Extensive experience installing, monitoring, and maintaining server and network equipment
  • In-depth knowledge of data center environments, servers, and network equipment
  • Proven experience executing on multiple tasks simultaneously
  • Experience installing various equipment that commonly resides in the data center environment
  • Ability to lift 75 pounds occasionally

Nice to have

  • Experience providing server and network hardware troubleshooting support
  • Ability to solve complex problems
  • Ability to use data to test theories
  • Ability to formulate and identify problems
  • Ability to generate and evaluate a variety of solutions
  • Ability to implement the best one(s)

What the JD emphasized

  • extensive experience
  • Proven experience
  • 5+ years of experience leading a team of 3 or more data center engineers