Senior Software Engineer - App Foundations

Abnormal AI · Vertical AI · United States · Remote · Division Foundations

Senior Backend Engineer on the App Foundations team at an AI-native company. The role owns core customer-facing data and platform services, including the LLM inference platform (LMS). Responsibilities include designing and operating high-performance services, leading end-to-end delivery of platform features, improving reliability and observability, contributing to the inference platform (model routing, provider integrations, batch inference, safety rails), and building AI-native internal tools. The role emphasizes daily use of AI coding tools and shipping AI-powered solutions beyond the IDE. Requires strong backend engineering experience, system design skills, and experience with AI tools and LLMs in production.

What you'd actually do

  1. Design, build, and operate high-performance, low-latency services across the App Foundations portfolio — Python/Django, Go, MySQL/PostgreSQL, Kafka, Redis, and Elasticsearch — in close partnership with engineers, designers, PMs, and EMs across R&D.
  2. Lead end-to-end delivery of platform features and stability workstreams — from tech design and PRR through launch, post-launch ops, and long-term ownership.
  3. Raise the operational bar: drive SLO-backed reliability work on Tier-0 systems like Notifications and LMS, improve our observability and on-call posture, and reduce unplanned work on the systems you own.
  4. Contribute to our LLM inference platform (LMS) — model routing and fallback, provider integrations (Azure OpenAI, Bedrock), batch/async inference, cost attribution, and the safety rails that let product teams ship AI features without reinventing the stack.
  5. Build AI-native internal tools that change how the team operates: automated ticket triage, on-call copilots, post-mortem drafting, customer-request classifiers — following the precedent of our Nora workflow and existing Claude skills.
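For context on point 4 above, the routing-and-fallback pattern it describes can be sketched in a few lines. This is a hypothetical illustration — the provider names, interfaces, and error handling here are assumptions for the sketch, not the actual LMS internals:

```python
# Minimal model-routing-with-fallback sketch. Provider names and the
# Callable interface are illustrative placeholders, not a real LMS API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # prompt -> completion

class RoutingError(Exception):
    """Raised when every provider in the route fails."""

def route_with_fallback(prompt: str, providers: list[Provider]) -> tuple[str, str]:
    """Try providers in priority order; fall through to the next on failure."""
    errors: list[str] = []
    for provider in providers:
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:  # in practice, catch provider-specific errors
            errors.append(f"{provider.name}: {exc}")
    raise RoutingError("; ".join(errors))

# Example: the primary provider times out, so the router falls back.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("deadline exceeded")

def healthy_fallback(prompt: str) -> str:
    return f"echo: {prompt}"

name, result = route_with_fallback(
    "hello",
    [Provider("azure-openai", flaky_primary), Provider("bedrock", healthy_fallback)],
)
```

In production this shape is typically extended with per-provider quotas, retries with backoff, and cost attribution per caller — the concerns the bullet lists.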

Skills

Required

  • 5+ years of backend software engineering experience building and operating production services at scale.
  • 4+ years on relevant tech stacks — Python (Django) and/or Go; MySQL / PostgreSQL; and hands-on experience with Kafka, Redis, and Elasticsearch.
  • 3+ years supporting enterprise-class customers in production, including ownership of SLOs, on-call, and incident response.
  • 2+ years of system design experience — demonstrated ability to design, document, and land non-trivial cross-team architectures.
  • Proven experience leading multi-quarter projects end-to-end: tech design, implementation, launch, and post-launch operations.
  • Demonstrated daily use of AI coding tools in your current workflow, and specific examples of where you’ve used AI to improve how you or your team works beyond just code generation.
  • Strong written and verbal communication; proven ability to collaborate autonomously and asynchronously with remote stakeholders.
  • Bachelor’s degree in Computer Science or equivalent professional experience.

Nice to have

  • Hands-on experience building with LLMs in production — provider APIs (OpenAI/Anthropic/Bedrock), model routing and fallback, evals, batch inference, or cost/quota tooling.
  • Experience with notification delivery systems, licensing/entitlement platforms, or other high-leverage internal platforms used by many product teams.
  • Experience building and shipping AI agents or automations (coding agents, ops agents, classifier pipelines) that replaced meaningful manual work.
  • Experience mentoring engineers and acting as a tech lead on at least a small squad.
  • Master’s degree in Computer Science or a related field.

What the JD emphasized

  • Demonstrated daily use of AI coding tools in your current workflow, and specific examples of where you’ve used AI to improve how you or your team works beyond just code generation.
  • Have built or shipped something AI-powered beyond the IDE — a script, an agent, a workflow automation, a production feature — that actually changed how work got done.
  • Think at the workflow level, not just the keystroke level: you look at an ops process, a support queue, or a team ritual and ask “what should an agent own here?”

Other signals

  • AI-native company
  • LLM inference platform
  • AI-native internal tools
  • AI coding tools as default