Inference ML API SDET

Cerebras · Semiconductors · Headquarters +1 · Software

The role focuses on testing and validating AI/ML models, specifically for inference solutions on Cerebras' AI chip. The candidate will lead testing strategy, develop scalable tests and frameworks, and ensure the quality of software and hardware components, with a strong emphasis on performance and accuracy at scale. The role also involves driving automation and mentoring junior engineers.

What you'd actually do

  1. Architect and own end-to-end test strategies for new features, developing scalable tests, frameworks, and tooling to ensure quality.
  2. Lead contributions to industry-standard benchmarks and drive adoption of rigorous evaluation methodologies.
  3. Define and drive automation initiatives to significantly improve internal engineering efficiency and test coverage.
  4. Make strategic decisions around coverage trade-offs, resource requirements, and risk-based testing priorities.
  5. Serve as a technical anchor in a highly agile environment, adapting quickly to shifting priorities while maintaining quality standards.

Skills

Required

  • 5+ years of relevant industry experience in software integration, development, or quality engineering.
  • Deep expertise in automation and programming using one or more languages such as Python, C++, or Go; ability to design and build reusable test frameworks from the ground up.
  • Proven experience testing compute, machine learning, networking, or storage systems within large-scale enterprise environments.
  • Strong track record of debugging complex issues across distributed, scaled-out deployments.
  • Demonstrated ability to lead cross-functional quality initiatives involving product development, product management, customer operations, and field teams.
  • Excellent verbal and written communication skills, with experience presenting technical findings to both engineering and leadership audiences.
  • Strong organizational skills, ownership mindset, and ability to drive projects to completion independently.
  • Experience leading and mentoring engineers across geographically dispersed teams and time zones.

Nice to have

  • Hands-on experience with ML workloads including LLM and/or multimodal training or inference.
  • Deep familiarity with hardware architecture, performance optimizations, compilers, and ML frameworks.
  • Experience designing test strategies for distributed systems, cloud infrastructure, and security validation.
  • Experience with microservices deployment, debugging, and orchestration at scale.
  • Prior experience owning or significantly contributing to a team's quality engineering culture or test infrastructure.

What the JD emphasized

  • lead testing strategy and execution for AI/ML models
  • evaluating accuracy, fairness, and performance at scale
  • key technical leader in delivering and validating all software and hardware components for Cerebras API Features
  • own software components feature integration quality
  • drive pre-deployment and production validation for Cerebras inference solutions
  • define and champion best testing practices
  • establish robust debugging methodologies
  • mentor junior engineers
  • advocating for world-class product quality
  • Deep expertise in automation and programming
  • Proven experience testing compute, machine learning, networking, or storage systems within large-scale enterprise environments
  • Strong track record of debugging complex issues across distributed, scaled-out deployments
  • Demonstrated ability to lead cross-functional quality initiatives
  • Hands-on experience with ML workloads including LLM and/or multimodal training or inference

Other signals

  • testing AI/ML models at scale
  • evaluating accuracy, fairness, and performance
  • validating software and hardware components for Cerebras inference solutions
  • driving pre-deployment and production validation
  • building quality systems that scale