What you'd actually do

Design and optimize high-throughput, low-latency inference systems.

Write and maintain high-performance GPU kernels using Triton or CUDA to accelerate custom model layers and critical workloads.

Conduct deep performance analysis using tools such as PyTorch Profiler and NVIDIA Nsight to identify bottlenecks in compute, memory, and communication.

Partner with infrastructure teams to design scalable and reliable distributed serving systems across heterogeneous hardware environments (e.g., A100, H100, B200, CPU).

Establish and track efficiency metrics such as cost per million inferences.

Skills

Required

Python
C++
distributed inference
GPU architecture
performance profiling
inference serving workloads
large-scale inference
distributed frameworks
runtime systems
inference compilation
optimization tools
system-level performance tradeoffs
compute
memory
I/O subsystems
benchmarking
system efficiency
scalability
reliability

Nice to have

Triton
CUDA
TensorRT
ONNX Runtime
AOTI
operator fusion
graph-level optimization
PyTorch Profiler
NVIDIA Nsight
CUDA tooling
NCCL
Docker
Kubernetes
Transformers
multimodal models
Mixture-of-Experts (MoE)
Diffusion Transformers (DiT)

The Opportunity

Photoshop ART is seeking a Senior Researcher - Machine Learning Systems & Efficiency Engineer to join our R&D team focused on delivering practical, production-ready improvements in inference performance, latency, and cost efficiency across image editing applications. This role sits at the intersection of model architecture, systems, inference runtimes, and services, with a clear mandate: deliver high-quality ML systems at substantially lower cost and higher efficiency. Individuals in this role are expected to have deep expertise in areas such as Artificial Intelligence (AI), ML systems, and computer vision. Strong preference will be given to candidates with experience in distributed inference, multimodal model profiling, and performance optimization. You will work closely with research, product, and infrastructure teams to influence model design decisions, improve GPU utilization, and build scalable, cost-aware ML systems deployed in production.

This is a hands-on, high-leverage role where a single engineer can drive outsized impact, potentially saving millions of dollars in compute costs. The ideal candidate will have a strong interest in developing practical innovations that advance Adobe products.

Job Responsibilities

Inference & Serving Optimization: Design and optimize high-throughput, low-latency inference systems. Optimize model architectures to improve deployment and runtime efficiency using techniques such as distillation, pruning, quantization, and Mixture-of-Experts (MoE). Implement advanced serving strategies including batching, caching (KV, semantic, embedding), quantization (FP8/INT8), and distributed inference strategies including data, tensor, pipeline, expert, and hybrid parallelism, with a focus on balancing computation and communication efficiency. Explore training or fine-tuning approaches when they directly lead to more efficient inference, simpler deployment, or improved runtime performance.
Kernel Development & System Acceleration: Write and maintain high-performance GPU kernels using Triton or CUDA to accelerate custom model layers and critical workloads. Improve GPU utilization through kernel fusion, asynchronous pipelines, and optimized scheduling strategies.
Performance Profiling & System Optimization: Conduct deep performance analysis using tools such as PyTorch Profiler and NVIDIA Nsight to identify bottlenecks in compute, memory, and communication. Optimize end-to-end system performance across inference workloads.
Distributed Systems & Infrastructure Collaboration: Partner with infrastructure teams to design scalable and reliable distributed serving systems across heterogeneous hardware environments (e.g., A100, H100, B200, CPU). Contribute to resource scheduling, GPU pooling, and elastic workload management.
Cost-Aware ML Engineering: Establish and track efficiency metrics such as cost per million inferences. Build benchmarking frameworks and dashboards to guide tradeoffs among quality, latency, and compute cost, enabling data-driven system and product decisions.
Technical Leadership & Best Practices: Serve as a trusted technical advisor to research and product teams on efficiency tradeoffs. Define best practices for scalable and cost-efficient ML development and mentor engineers on performance-oriented systems design.

What You’ll Need to Succeed

Education: Master’s or PhD in Computer Science, Electrical Engineering, or a related field, with a focus on machine learning systems, distributed systems, or high-performance computing.
Distributed Inference & Serving Expertise: Hands-on experience implementing and scaling large-scale inference or serving workloads using distributed frameworks and runtime systems (e.g., Triton, vLLM, SGLang, xDiT, or similar). Experience applying inference compilation and optimization tools (e.g., TensorRT, ONNX Runtime, AOTI), including techniques such as operator fusion and graph-level optimization, with a strong understanding of system-level performance tradeoffs.
GPU & Performance Engineering Skills: Strong understanding of GPU architecture (e.g., memory hierarchy, compute throughput, communication bandwidth) and practical experience diagnosing performance bottlenecks across compute, memory, and I/O subsystems.
Programming & Systems Development: Proficiency in Python and C++, with experience building high-performance or distributed systems. Familiarity with CUDA or Triton for performance-critical workloads is highly desirable.
Data-Driven Engineering Mindset: Demonstrated ability to make engineering decisions based on rigorous measurement and benchmarking, with a focus on improving system efficiency, scalability, and reliability in production environments.

Preferred Experience

ML Frameworks & Tooling: Experience contributing to or maintaining performance- or efficiency-focused libraries or systems. Hands-on experience with:
- Open-source serving frameworks (e.g., vLLM, SGLang, xDiT, or similar)
- Inference compilation tools (e.g., TensorRT, Triton, AOTI, or equivalent) and techniques such as operator fusion and graph-level optimization
- GPU profiling and performance analysis tools (e.g., PyTorch Profiler, NVIDIA Nsight, CUDA tooling)
Distributed Systems & Communication: Exposure to low-level communication libraries such as NCCL and a practical understanding of collective operations (e.g., AllReduce, AllGather) in large-scale distributed serving environments.
Containerization & Cluster Operations: Familiarity with containerized workflows (Docker, Kubernetes) and job scheduling in headless Linux environments, including experience operating production ML workloads on shared GPU clusters.
Model Architectures: Working knowledge of model architectures such as Transformers, multimodal models, Mixture-of-Experts (MoE), or Diffusion Transformers (DiT).

About Adobe

Adobe empowers everyone to create through innovative platforms and tools that unleash creativity, productivity and personalized customer experiences. Adobe’s industry-leading offerings including Adobe Acrobat Studio, Adobe Express, Adobe Firefly, Creative Cloud, Adobe Experience Platform, Adobe Experience Manager, and GenStudio enable people and businesses to turn ideas into impact, powered by AI and driven by human ingenuity.

Our 30,000+ employees worldwide are creating the future and raising the bar as we drive the next decade of growth. We’re on a mission to hire the very best and believe in creating a company culture where all employees are empowered to make an impact. At Adobe, we believe that great ideas can come from anywhere in the organization. The next big idea could be yours.

Let’s Adobe together

At Adobe, we believe in creating a company culture where all employees are empowered to make an impact. Learn more about Adobe life, including our values and culture, focus on people, purpose and community, Adobe for All, comprehensive benefits programs, the stories we tell, the customers we serve, and how you can help us advance our mission of empowering everyone to create.

Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other protected characteristic.

Adobe aims to make our Careers website and recruiting process accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email accommodations@adobe.com or call +1 408-536-3015.

AI Use Guidelines for Interviews: Our interviews are designed to reflect your own skills and thinking. The use of AI or recording tools during live interviews is not permitted unless explicitly invited by the interviewer or approved in advance as part of a reasonable accommodation. If these tools are used inappropriately or in a way that misrepresents your work, your application may not move forward in the process.

At Adobe, we empower employees to innovate with AI — and we look for candidates eager to do the same. As part of the hiring experience, we provide clear guidance on where AI is encouraged during the process and where it’s restricted during live interviews. See how we think about AI in the hiring experience.

Expected Pay Range:

Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on those defined markets. The U.S. pay range for this position is $142,700 - $270,950 annually. Pay within this range varies by work location and may also depend on job-related knowledge, skills, and experience. Your recruiter can share more about the specific salary range for the job location during the hiring process.

In California, the pay range for this position is $187,100 - $270,950.
In Washington, the pay range for this position is $168,600 - $244,200.

At Adobe, starting salaries for sales roles are expressed as total target compensation (TTC = base + commission), and short-term incentives take the form of sales commission plans. Starting salaries for non-sales roles are expressed as base salary, and short-term incentives take the form of the Annual Incentive Plan (AIP).

In addition, certain roles may be eligible for long-term incentives in the form of a new hire equity award.

State-Specific Notices:

California:

Fair Chance Ordinances

Adobe will consider qualified applicants with arrest or conviction records for employment in accordance with state and local laws and “fair chance” ordinances.

Colorado:

Application Window Notice

If this role is open to hiring in Colorado (as listed on the job posting), the application window will remain open until at least the date and time stated above in Pacific Time, in compliance with Colorado pay transparency regulations. If this role does not have Colorado listed as a hiring location, no specific application window applies, and the posting may close at any time based on hiring needs.

Massachusetts:

Massachusetts Legal Notice

It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.