What you'd actually do

Optimizing key use cases and models, as well as debugging and resolving issues related to accuracy and memory management.

Designing and developing model deployment frameworks, such as leveraging new features in vLLM to accelerate inference.

Developing and debugging high-performance kernels specifically for Intel GPUs and CPUs.

Engaging in deep technical syncs with architects and peers to iterate on solutions and provide progress transparency.

Transforming innovative ideas into production-ready features.

Job Details:

Job Description:

Artificial Intelligence (AI) is transforming our lives and becoming ubiquitous. Intel is at the heart of this revolution. We provide a robust software stack that integrates seamlessly into the frameworks used by millions of end users worldwide. As our AI Engineering team in Shanghai continues to expand, we are looking for a passionate Graduate Technical Intern to help us deliver high-performance, high-quality deep learning solutions.

Your responsibilities will include:

Performance Optimization: Optimizing key use cases and models, as well as debugging and resolving issues related to accuracy and memory management.
Deployment Architecture: Designing and developing model deployment frameworks, such as leveraging new features in vLLM to accelerate inference.
Kernel Development: Developing and debugging high-performance kernels specifically for Intel GPUs and CPUs.
Architectural Alignment: Engaging in deep technical syncs with architects and peers to iterate on solutions and provide progress transparency.
Innovation: Transforming innovative ideas into production-ready features.

Qualifications:

Current Master’s or Ph.D. student in Computer Science, Artificial Intelligence, Software Engineering, or related fields.
Strong proficiency in C++ and Python programming.
Solid understanding of Deep Learning fundamentals with proven practical experience.

Preferred Qualifications:

Experience with LLMs, Multimodal models, or Agents, with a deep understanding of model architectures.
Familiarity with PyTorch and high-performance inference frameworks like vLLM.
Hands-on experience in GPU Kernel development (e.g., CUDA/Triton).
Availability: Minimum 4 days per week with a commitment of 6 months or longer.

Job Type:

Student / Intern

Shift:

Shift 1 (China)

Primary Location:

PRC, Shanghai

Additional Locations:

Business group:

The Sales and Marketing Group (SMG) leverages the product portfolio to drive Intel's revenue growth and market expansion, blending strategic initiatives with dynamic sales efforts to capture and retain customers. SMG is responsible for empowering the sales force with tools and insights needed to close deals and build lasting customer relationships. Sales analytics and market research ensure strategies are both targeted and impactful. In SMG, disciplined execution, creativity, and ambition are celebrated, providing ample opportunities for career advancement and skill development.

Posting Statement:

All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

Position of Trust

N/A

Work Model for this Role

This role will require an on-site presence. * Job posting details (such as work model, location or time type) are subject to change.

ADDITIONAL INFORMATION: Intel is committed to Responsible Business Alliance (RBA) compliance and ethical hiring practices. We do not charge any fees during our hiring process. Candidates should never be required to pay recruitment fees, medical examination fees, or any other charges as a condition of employment. If you are asked to pay any fees during our hiring process, please report this immediately to your recruiter.

Workload Optimization Intern
