System Software Engineer - Performance Lab

NVIDIA · Semiconductors · Santa Clara, CA

System Software Engineer focused on performance optimization for GPU-accelerated financial AI workloads, including deep learning training and inference, and benchmarking on HPC clusters. The role involves working with containerized applications, performance data visualization, and building reference models for NVIDIA in the fintech domain.

What you'd actually do

  1. Writing and maintaining containerized, GPU-accelerated workloads for the financial services industry, from deep learning training and inference to portfolio optimization and backtesting.
  2. Running, validating, and analyzing benchmarking models at scale on HPC clusters.
  3. Visualizing performance data, building charts and dashboards using internal schemas and tooling.
  4. Working closely with state-of-the-art financial AI models and tooling to help build reference models for NVIDIA.

Skills

Required

  • Python
  • Linux command-line environment
  • version control
  • machine learning lifecycle

Nice to have

  • PyTorch
  • training machine learning models
  • testing machine learning models
  • evaluating machine learning models
  • GPU computing
  • CUDA
  • cuOpt
  • CUTLASS
  • cuDNN
  • Kubernetes
  • Slurm
  • containerized applications
  • resource management
  • quantitative finance

What the JD emphasized

  • 8+ years of experience
  • Foundational understanding of, and interest in, the machine learning lifecycle (training, evaluation, and inference).

Other signals

  • GPU-accelerated workloads
  • deep learning training and inference
  • benchmarking models at scale
  • financial AI models