Staff Software Engineer, Data Platform

Scale AI · Data AI · New York, NY +1 · Horizontals EPD

Scale AI is seeking a Staff Software Engineer to lead the design and development of core data storage, streaming, caching, and indexing platforms. This role is crucial for supporting the company's AI products, including those involved in RLHF and model evaluation, by ensuring the reliability and scalability of its data infrastructure.

What you'd actually do

  1. Drive the architecture, design, implementation, and reliability of our foundational data platforms and systems, working closely with stakeholders and internal customers to understand and refine requirements.
  2. Collaborate with cross-functional teams to define, design, and deliver new features.
  3. Proactively identify opportunities for, and drive improvements to, current programming practices, including process enhancements and tool upgrades.
  4. Present technical information to teams and stakeholders, providing guidance and insight on development processes and technologies.
  5. Provide technical leadership, including upholding and raising engineering standards across the organization and mentoring junior engineers.

Skills

Required

  • 8+ years of full-time engineering experience
  • Specialties in back-end systems
  • Building large-scale data storage, streaming, and warehousing systems
  • Experience in various database technologies (MongoDB, Postgres)
  • Experience in streaming/processing solutions (Kinesis, Flink, Spark)
  • Experience in indexing/caching (ElasticSearch, Redis)
  • Experience with data query engines (Trino, Presto, Snowflake, etc.)
  • Mentoring and leading teams
  • Communication and collaboration skills
  • Translating complex technical concepts for non-technical stakeholders
  • Experience working fluently with standard containerization & deployment technologies like Kubernetes
  • Experience with various public cloud offerings
  • Extensive experience in software development
  • Deep understanding of distributed systems
  • Deep understanding of cloud platforms
  • Deep understanding of data systems
  • Experience driving cross-functional collaboration and communication at an organizational or broader level

Nice to have

  • Strong knowledge of software engineering best practices
  • CI/CD tooling (CircleCI)
  • Performance tuning and cost optimization of cloud-based data platforms
  • Experience defining a data lifecycle strategy
  • Designing/implementing tooling for data privacy (e.g., GDPR) needs
  • Experience scaling products at hyper-growth startups
  • Excitement to work with AI technologies

What the JD emphasized

  • 8+ years of full-time, post-graduation engineering experience, with specialties in back-end systems, specifically building large-scale data storage, streaming, and warehousing systems.
  • Extensive experience in various database technologies (MongoDB, Postgres), streaming/processing solutions (Kinesis, Flink, Spark), indexing/caching (ElasticSearch, Redis), and various data query engines (Trino, Presto, Snowflake, etc.).
  • A track record of mentoring and leading teams on successful projects.