Principal/Senior Principal Engineer Systems – High Performance Computing (California)

Northrop Grumman · Aerospace · Redondo Beach, CA +1 · Systems/Architecture/Test

Northrop Grumman is seeking a Principal/Senior Principal Systems Engineer for High Performance Computing (HPC) to support Survivability Computational Electromagnetic codes and tools. This role involves managing HPC systems throughout their lifecycle, performing technical planning, system integration, and identifying/resolving performance bottlenecks. The position also requires implementing AI/ML to support code deployment, maintenance, and optimization efforts, and will work closely with a Subject Matter Expert to push the envelope in scientific electromagnetics.

What you'd actually do

  1. Collaborate with IT, Cybersecurity, and HPC engineering to manage HPC systems throughout their life cycle (concept, design, fabrication, test, installation, operation, maintenance, and disposal).
  2. Perform HPC technical planning, system integration, verification and validation, cost and risk evaluation, and supportability and effectiveness analyses.
  3. Monitor and report HPC system health/metrics. Gather data; perform analysis; reproduce, resolve or escalate, and drive issues through to closure.
  4. Identify and resolve system performance bottlenecks.
  5. Diagnose hardware and software configuration issues.

Skills

Required

  • Bachelor's degree in a STEM discipline with 5 years of related engineering experience (Principal) OR Bachelor's degree in a STEM discipline with 8 years of related engineering experience (Senior Principal).
  • Experience with HPC system design, supercomputer implementation, HPC networked architecture and parallel processing optimization.
  • Knowledgeable in parallel programming (CUDA, inlining, vectorization, concurrentization and parallelization of software).
  • Knowledgeable in Linux systems management, e.g. parallel file systems, memory management, and kernel optimization.
  • Programming experience with C++ or similar programming languages.
  • Active in-scope DoD Top Secret Security Clearance.

Nice to have

  • Security+ Certified (DoD 8570/8140).
  • Expertise with programming languages such as C++, C#, Java, Python, and CUDA.
  • Experience with the software compilation process (compilers, GNU Make, CMake, etc.).
  • Experience with architecture and design (architecture, design patterns, reliability, and scaling) of new and existing HPC systems.
  • Strong knowledge of HPC technologies, storage, parallel processing, and scientific computing frameworks.
  • Experience building and maintaining large-scale distributed systems and running scalable distributed application software.
  • Hands-on experience with HPC hardware diagnostics and repair.
  • Master's degree or higher in Computer Engineering, Computer Science, or Systems Engineering from an accredited college or university.
  • Familiarity with cybersecurity requirements and methodologies. Working knowledge of JSIG and STIG.

What the JD emphasized

  • Active in-scope DoD Top Secret Security Clearance is required to start, with the ability to obtain and maintain clearance to Special Access Programs (SAPs).

Other signals

  • Implement Artificial Intelligence / Machine Learning to support code deployment, maintenance, and optimization efforts
  • HPC Systems Engineering
  • Parallel processing optimization
  • CUDA
  • Linux systems management