Command Center Systems Engineer

Weights & Biases · Data AI · Kenilworth, NJ, DC · Data Center - G&A

This role is for a Command Center Systems Engineer at CoreWeave who will build and maintain the operational technology layer for the company's global data center fleet. The engineer is responsible for real-time visibility, intelligent alerting, and automated response systems, which involves integrating infrastructure platforms and automating manual workflows to improve uptime and reduce risk. The role requires experience in data center operations or SRE, proficiency in a scripting language such as Python or Go, and familiarity with enterprise monitoring and observability platforms.

What you'd actually do

  1. Own the evolution of the Command Center's "Single Pane of Glass", integrating signals from DCIM, EPMS, BMS, and other infrastructure platforms into a unified, real-time operational view.
  2. Transition alerting from threshold-based noise to intelligent, correlated logic using platforms such as Grafana, Prometheus, Ignition, or equivalent tools.
  3. Automate manual operator workflows, including reporting, vendor check-ins, initial triage, and ticket creation.
  4. Design and maintain data visualization dashboards and KPI reporting tools that give operators and leadership clear, actionable insight into fleet health and performance.
  5. Integrate and optimize Jira workflows to accelerate incident response, change management tracking, and task automation.

Skills

Required

  • Python
  • Go
  • infrastructure automation
  • workflow orchestration
  • enterprise monitoring platforms
  • observability platforms
  • Grafana
  • Prometheus
  • Ignition
  • SCADA
  • BMS
  • EPMS
  • BACnet
  • SQL
  • NoSQL
  • time series data
  • telemetry platforms
  • data visualization
  • technical documentation

Nice to have

  • hyperscale or AI infrastructure environments
  • historian/telemetry platforms
  • time series data pipelines
  • Agile development practices
  • physical infrastructure (Power, Cooling, Networking)
  • high-density GPU workloads
  • Jira project structures and automations
  • Lean or Six Sigma methodology

What the JD emphasized

  • mission-critical environment
  • Python, Go, or other scripting languages for infrastructure automation and workflow orchestration
  • enterprise monitoring and observability platforms (Grafana, Prometheus, Ignition, or similar)
  • SCADA, BMS, EPMS, and BACnet-type systems
  • time series data, telemetry platforms, and data visualization best practices
  • high-availability environments