High-Performance Computing (HPC) (SA2) (Government)

AT&T · Telecom · Columbia, MD

System Administrator role focused on High-Performance Computing (HPC) environments: primarily Linux-based clusters, parallel file systems, and high-speed interconnects. Responsibilities include installation, configuration, monitoring, incident response, troubleshooting, automation via scripting, and system hardening for a government client. Requires a TS/SCI clearance with polygraph and a DoD 8570 IAT II level certification.

What you'd actually do

  1. Administer Linux-based HPC clusters (e.g., Red Hat/CentOS/Rocky/Ubuntu) with parallel file systems (e.g., Lustre/GPFS) and high-speed interconnects (e.g., InfiniBand/Slingshot).
  2. Install/configure Linux OS, file systems, and TCP/IP networking; troubleshoot OS and application issues.
  3. Automate and administer systems with Bash scripting; compile and install software as required.
  4. Use common operations and observability tooling: Jira, Confluence, Grafana, Prometheus, Nagios.
  5. Support HPC workload and configuration management tooling: Slurm, git, Salt, Ansible.

Skills

Required

  • B.S. in a technical discipline and 5 years’ experience as a System Administrator
  • DoD 8570 IAT II level certification
  • TS/SCI with polygraph clearance
  • Linux OS administration (Red Hat/CentOS/Rocky/Ubuntu)
  • Parallel file systems (Lustre/GPFS)
  • High-speed interconnects (InfiniBand/Slingshot)
  • TCP/IP networking
  • Bash scripting
  • Slurm
  • git
  • Salt
  • Ansible
  • Jira
  • Confluence
  • Grafana
  • Prometheus
  • Nagios

Nice to have

  • Windows OS administration
  • UNIX administration
  • Troubleshooting heterogeneous systems

What the JD emphasized

  • TS/SCI with polygraph
  • DoD 8570 IAT II level certification required