Edge ML Software Engineer (System Modeling - PICO) - San Jose

ByteDance · Big Tech · San Jose, CA · R&D

Develop transaction-level models of edge NPU architectures for ML workloads (CNNs, Transformers) to simulate execution, analyze performance, and optimize for latency, memory, and power targets. Requires strong C/C++ and SystemC proficiency, a solid understanding of computer architecture, and experience with ML accelerator modeling.

What you'd actually do

Develop bit-accurate transaction-level models of edge NPU architectures with approximate timing and power accuracy. The models cover compute units, interconnects, memory hierarchies, and data movement, and simulate the execution of representative ML workloads, including both CNNs and Transformers.
Analyze ML models for compute intensity, memory footprint, bandwidth requirements, and operator-level latency.
Collaborate with compiler and algorithm teams to optimize ML workloads for latency, memory, and power targets.

Skills

Required

Computer architecture
Electronic system modeling
C/C++
SystemC
ML workload analysis

Nice to have

Performance and power models for ML accelerators
Performance bottleneck analysis
Architectural recommendations

What the JD emphasized

ML workloads
edge NPU architectures
ML accelerators

Other signals

ML workloads
NPU architectures
edge ML

As a world-renowned VR/AR brand with independent innovation and R&D capabilities, PICO has been at the forefront of the consumer electronics market. We have teams in Europe, Japan, and South Korea. We are now looking for experts in image pipelines to join us and build our AR/VR imaging team.

Responsibilities:

Develop bit-accurate transaction-level models of edge NPU architectures with approximate timing and power accuracy. The models cover compute units, interconnects, memory hierarchies, and data movement, and simulate the execution of representative ML workloads, including both CNNs and Transformers.
Analyze ML models for compute intensity, memory footprint, bandwidth requirements, and operator-level latency.
Collaborate with compiler and algorithm teams to optimize ML workloads for latency, memory, and power targets.

Requirements

Minimum Qualifications

Bachelor's degree in Computer Science, Electrical Engineering, Computer Engineering, or a related field, or equivalent practical experience.
3+ years of industry experience in computer architecture and electronic system modeling.
Strong understanding of computer architecture concepts such as memory, caches, DMA, tiling, vectorization, and systolic arrays.
Strong C/C++ and SystemC proficiency.

Preferred Qualifications

5+ years of relevant industry experience.
Experience building analytical performance and power models for ML accelerators.
Experience analyzing performance bottlenecks and generating actionable architectural recommendations.