High Performance Computing (HPC) (SA1) (Government)

AT&T · Telecom · Columbia, MD

System Administrator for High-Performance Computing (HPC) environments supporting government clients, focusing on Linux clusters, parallel file systems, and high-speed interconnects. Responsibilities include installation, configuration, maintenance, monitoring, incident response, and automation via scripting. Requires a TS/SCI clearance with polygraph and a DoD 8570 IAT Level II certification.

What you'd actually do

The System Administrator provides HPC sustainment support across two geographically dispersed sites, including:

  1. Linux-based HPC clusters (e.g., Red Hat/CentOS/Rocky/Ubuntu) with parallel file systems (e.g., Lustre/GPFS) and high-speed interconnects (InfiniBand/Slingshot).
  2. Transition of new systems/capabilities into operations (clusters, SMP/MPP, parallel file systems).
  3. Support to HPC and ABS (ABUNDANTSHIELD) SRE teams in accordance with Government policies and procedures.

Skills

Required

  • B.S. in a technical discipline and 3 years of experience as a System Administrator, or 8 years of experience in lieu of a degree
  • DoD 8570 IAT Level II certification
  • TS/SCI with polygraph clearance
  • Linux OS administration (Red Hat/CentOS/Rocky/Ubuntu)
  • Parallel file systems (Lustre/GPFS)
  • High-speed interconnects (InfiniBand/Slingshot)
  • TCP/IP networking
  • Bash scripting
  • Slurm
  • Git
  • Salt
  • Ansible
  • Jira
  • Confluence
  • Grafana
  • Prometheus
  • Nagios

Nice to have

  • Windows and UNIX systems
  • SMP/MPP systems

What the JD emphasized

  • TS/SCI with polygraph
  • DoD 8570 IAT Level II certification