Staff Python / PyTorch Developer — Frontend Inference Compiler – Dubai

Cerebras · Semiconductors · United Arab Emirates · Software

Staff Python/PyTorch Developer for Frontend Inference Compiler at Cerebras, focusing on optimizing generative AI models for their wafer-scale AI chip. Responsibilities include developing compiler infrastructure, analyzing new models, and improving inference performance.

What you'd actually do

  1. Analyze new models from the generative AI field and assess their impact on the compilation stack.
  2. Develop and maintain a model-definition framework of building blocks for representing large language models, based on PyTorch and Cerebras dialects and ready to deploy on Cerebras hardware.
  3. Develop and maintain the frontend compiler infrastructure that ingests PyTorch models and produces an intermediate representation (IR).
  4. Extend and optimize PyTorch FX / TorchScript / TorchDynamo-based tooling for graph capture, transformation, and analysis.
  5. Research new methods of model optimization to improve Cerebras inference performance.
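The graph capture and analysis work in items 3–4 can be sketched with public `torch.fx` APIs. This is a minimal illustration only; `TinyBlock` is a hypothetical toy module standing in for a transformer building block, not Cerebras code:

```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace

# Hypothetical toy module standing in for a model building block.
class TinyBlock(nn.Module):
    def __init__(self, dim: int = 8):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.proj(x)) + x

# Graph capture: symbolic_trace records forward() into an FX Graph,
# a dataflow IR whose nodes a frontend compiler can inspect and rewrite.
traced = symbolic_trace(TinyBlock())

# Walk the captured IR: each node is one op in the graph.
ops = [node.op for node in traced.graph.nodes]

# Simple analysis pass: count the compute nodes
# (module calls like Linear, function calls like relu and add).
n_calls = sum(op in ("call_module", "call_function") for op in ops)
```

A real frontend would go further, e.g. rewriting nodes in `traced.graph` and recompiling, or using TorchDynamo to capture graphs from unmodified Python code at runtime.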

Skills

Required

  • Python
  • PyTorch internals
  • computational graphs
  • tensor operations
  • model tracing
  • compilers
  • interpreters
  • ML graph optimization frameworks
  • PyTorch
  • HuggingFace Transformers library
  • Large Language Models
  • Transformer architecture
  • C++
  • MLIR

Nice to have

  • TensorFlow XLA
  • TVM
  • ONNX Runtime
  • hardware accelerators
  • quantization
  • runtime scheduling
  • multi-target inference compilation
  • numerical precision trade-offs
  • operator lowering
  • open-source ML compiler projects

What the JD emphasized

  • frontend compiler infrastructure
  • PyTorch FX / TorchScript / TorchDynamo

Other signals

  • fastest Generative AI inference solution
  • optimize for the Cerebras inference platform