Tenstorrent is leading the industry in cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

Tenstorrent is building next-generation AI systems that push the boundaries of model training, inference, and large-scale distributed compute. The ML Models team sits at the intersection of cutting-edge AI research and high-performance hardware, bringing state-of-the-art machine learning models to life on Tenstorrent’s custom AI accelerators. From training large language models to optimizing inference performance at scale, this team works across the full stack to turn breakthrough research into production-ready AI systems. If you are passionate about advancing the frontier of AI research, inference, and training optimization, this is an opportunity to shape how future AI models are developed and deployed.

This role is hybrid, based out of Toronto, ON and Boston, MA.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

Who You Are

Strong Python and PyTorch experience developing and training deep learning models.
Deep understanding of ML architectures, LLM training, and inference optimization.
Hands-on experience training large-scale machine learning models.
4+ years of industry and/or academic experience in ML research and LLM development.
PhD, published research, or experience with speculative decoding is highly valued.

What We Need

Lead research and development efforts focused on LLM training and inference optimization.
Train, evaluate, and optimize state-of-the-art AI models on Tenstorrent hardware.
Improve performance through techniques such as speculative decoding, quantization, kernel fusion, flash attention, and distributed training.
Investigate system bottlenecks and collaborate cross-functionally to drive performance improvements.
Translate cutting-edge ML research into scalable, production-ready solutions.

What You Will Learn

How to optimize AI models on custom AI accelerators from application to silicon.
How large-scale ML systems are deployed, tuned, and scaled in production.
How hardware, compiler, kernel, and ML teams collaborate to maximize performance.
The challenges and tradeoffs of scaling modern AI workloads across custom hardware.

Compensation for all engineers at Tenstorrent ranges from $100k - $500k including base and variable compensation targets. Experience, skills, education, background and location all impact the actual offer made.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.

Model Research, Optimization, and Training