Senior Developer Technology Engineer, CPU Performance

NVIDIA · Semiconductors · Santa Clara, CA +3 · Remote

Seeking a Senior Developer Technology Engineer to research and develop techniques for optimizing large-scale applications on NVIDIA's CPU platforms, focusing on data-intensive workloads and heterogeneous computing systems. The role involves in-depth analysis, performance optimization, publishing findings, and influencing future hardware/software design.

What you'd actually do

  1. Research and develop techniques to accelerate large-scale applications running on NVIDIA’s family of advanced CPU platforms.
  2. Work directly with technical experts in industry and academia to analyze and optimize complex database and data analytics workloads in depth, ensuring the best possible CPU performance on modern hardware architectures.
  3. Publish and present the optimization techniques you discover in developer blogs or at relevant conferences to engage and educate the developer community.
  4. Influence the design of next-generation hardware architectures, software, and programming models in collaboration with research, hardware, system software, libraries, and tools teams at NVIDIA.

Skills

Required

  • Master's or PhD in Computer Science, Computer Engineering, or a related computationally focused science degree (or equivalent experience).
  • At least 5 years of relevant work or research experience.
  • Expert knowledge of modern CPU architectures (ARM, x86) and system/OS.
  • In-depth expertise with CPU architecture fundamentals, especially the memory subsystem (cache, DRAM, storage).
  • Hands-on experience with low-level parallel programming, vectorization, CPU intrinsics, and concurrent data structures.
  • Programming fluency in modern C/C++ with a deep understanding of algorithms, concurrency, and other optimization techniques.
  • Good communication, organization, and prioritization skills, with a logical approach to problem solving.

Nice to have

  • Experience optimizing the performance of distributed database systems and frameworks (e.g., a production database or Spark).
  • Background with compression, storage systems, networking, and distributed computer architectures.
  • Knowledge of GPU architectures.

What the JD emphasized

  • Expert knowledge of modern CPU architectures (ARM, x86) and system/OS.
  • In-depth expertise with CPU architecture fundamentals, especially the memory subsystem (cache, DRAM, storage).
  • Hands-on experience with low-level parallel programming, vectorization, CPU intrinsics, and concurrent data structures.
  • Programming fluency in modern C/C++ with a deep understanding of algorithms, concurrency, and other optimization techniques.