Staff Software Engineer, Platform

Scale AI · Data AI · New York, NY +1 · Horizontals EPD

Scale AI is seeking a Staff Software Engineer to drive the architectural roadmap and implementation of the core platforms and software systems behind its AI data infrastructure. The role involves defining the high-level vision and driving adoption for orchestration, data abstraction, data pipelines, identity & access management, and cloud infrastructure, with exposure to the forefront of the AI race.

What you'd actually do

  1. Architectural Vision: You will drive the design and implementation of foundational systems, acting as a bridge between high-level business goals and technical goals.
  2. Cross-Functional Leadership: You will collaborate with cross-functional teams to define and drive adoption of the next generation of features for our AI data infrastructure.
  3. Technical Ownership: You are responsible for proactively identifying and pursuing opportunities for organizational growth, improving programming practices, and upgrading the tools that define our development lifecycle.
  4. Technical Mentorship: You will serve as a subject matter expert, presenting technical information to stakeholders and providing guidance that elevates the engineering culture across the company.

Skills

Required

  • back-end systems
  • distributed systems
  • public cloud platforms (AWS)
  • software development
  • Kubernetes
  • Terraform
  • Docker
  • orchestration platforms
  • Temporal
  • AWS Step Functions
  • NoSQL document databases (MongoDB)
  • structured databases (Postgres)
  • software engineering best practices
  • CI/CD tooling (CircleCI, ArgoCD)

Nice to have

  • data warehouses (Snowflake, Firebolt)
  • data pipeline/ETL tools (Dagster, dbt)
  • scaling products at hyper-growth startups
  • AI technologies

What the JD emphasized

  • 8+ years of full-time, post-graduation engineering experience, with a specialty in back-end systems.
  • Extensive experience in software development and a deep understanding of distributed systems and public cloud platforms (AWS preferred).
  • A demonstrated track record of independent ownership and leadership across successful multi-team engineering projects.
  • Experience working fluently with standard containerization & deployment technologies such as Kubernetes, Terraform, and Docker.
  • Experience with orchestration platforms, such as Temporal and AWS Step Functions.

Other signals

  • building LLMs at billion scale
  • human evaluation and reinforcement learning from human feedback (RLHF)
  • Generative AI Data Engine
  • SGP
  • Donovan
  • human data generation
  • model evaluation
  • safety
  • alignment
  • AI data infrastructure
  • AI race