AI Frameworks Engineer

Intel · Semiconductors · Santa Clara, California, United States +2

AI Frameworks Engineer at Intel focused on optimizing PyTorch, vLLM, and related AI framework software for Intel hardware. The role involves identifying performance bottlenecks, enabling key AI models (LLMs, generative AI, scientific AI), and profiling training/inference workloads to improve throughput, latency, and developer productivity.

What you'd actually do

  1. Identify performance bottlenecks and additional features necessary to run Argonne AI CoE workloads.
  2. Optimize PyTorch, vLLM, and related AI framework software for Intel CPUs, GPUs, and AI accelerators.
  3. Enable and optimize key AI models, including large language models, generative AI models, and scientific AI workloads.
  4. Profile AI training and inference workloads to identify issues across framework, runtime, kernel, and hardware layers.
  5. Collaborate with cross-functional teams to define technical specifications and software requirements.

Skills

Required

  • Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, or a STEM-related field with 3+ years of experience in software development, OR a Master's degree with 1+ years, OR a Ph.D. with 3+ months.
  • 3+ years of experience in AI framework development or optimization.
  • 3+ years of experience in PyTorch, vLLM, TensorFlow, Hugging Face, DeepSpeed, or related AI software frameworks.
  • 3+ years of experience in AI model development, enablement, profiling, or performance optimization.
  • 3+ years of experience in GPU or accelerator software development.
  • 3+ years of experience in runtime, compiler, kernel, or backend optimization for AI workloads.

Nice to have

  • Advanced degree (Master's or Ph.D.).
  • Proficiency in Python, SYCL/CUDA, and C++ programming.
  • Experience developing in Linux environments.
  • Background in AI framework internals, model execution, or backend integration.
  • Experience with PyTorch, vLLM, Hugging Face Transformers, DeepSpeed, or similar AI frameworks.
  • Experience optimizing AI training or inference workloads for CPUs, GPUs, or accelerators.
  • Background in performance profiling, memory optimization, throughput improvement, or latency reduction.
  • Experience enabling or optimizing large language models, generative AI models, or scientific AI models.
  • Understanding of deep learning algorithms, model architectures, and AI workload patterns.
  • Strong analytical skills and ability to solve complex software challenges.
  • Passion for driving meaningful advancements in AI software and scientific computing.

What the JD emphasized

  • performance bottlenecks
  • optimize
  • Intel CPUs, GPUs, and AI accelerators
  • AI training and inference workloads
  • large language models
  • generative AI models
  • scientific AI workloads

Other signals

  • optimize AI software frameworks