What you'd actually do

Maintain architectural coherence across a complex and wide surface area.

Own the load-bearing substrate through future growth and failure scenarios. The system must survive several multiples of current enterprise scale and every class of infrastructure event we can name.

Drive the structural decomposition of platform and domain boundaries, service boundaries across the microservice portfolio, agentic framework direction, auth and access-control architecture, and cross-cutting concerns like observability and rate limiting.

Bridge AI research and production infrastructure. Translate LLM and model requirements into efficient, production-grade services. Lead the agentic framework direction — memory architectures, tool orchestration, A2A patterns, semantic caching. Operate a GPU fleet at high duty cycle and understand dynamic batching, cross-session batching tradeoffs, and HPA calibration for GPU workloads.

Document architectural patterns and deviations. Write Architectural Design Records, RFCs, and breaking-points documents that stand alone for readers unfamiliar with the context. Document deviations from established patterns with their reasons — not as failures, but as data points.

Skills

Required

Go or Python
real-time streaming systems
offline/batch architectures
GPU inference infrastructure
LLM APIs
agent framework design
memory architectures
tool orchestration
observability
rate limiting
client-side implications across native mobile, web, SDK, and browser extension

Nice to have

healthcare-adjacent technology

What the JD emphasized

proven track record operating at an Architect, Staff, or Principal level

Expert-level proficiency in Go or Python

Production ownership of a real-time streaming system

Experience operating multi-TB storage systems with active OLTP traffic

Hands-on experience with offline/batch architectures

Experience designing and operating GPU inference infrastructure at scale

Experience integrating and scaling systems using LLM APIs or foundation models

What we want to accomplish and why we need you.

Suki is a leading technology company that provides AI voice solutions for healthcare. Its mission is to reimagine the healthcare technology stack, making it invisible and assistive to lift the administrative burden from clinicians. Its flagship product is Suki Assistant, an AI assistant that uses generative AI to automatically create clinical documentation by ambiently listening to patient-clinician conversations. Suki helps clinicians complete notes 72% faster on average, assists with other tasks including coding and answering questions, and generates incremental revenue for organizations, delivering a 9X ROI in year 1. Suki also offers its proprietary AI and speech platform, Suki Platform, to partners who want to create best-in-class ambient and voice experiences for their solutions. Clinicians that use Suki already spend over 70% less time on administrative tasks, and we’re striving to do even better. Come and join us!

We are a user-driven company and are committed to making sure every pixel of our product is in service of the doctors. We’re a team of technologists, clinicians, and industry experts working together to push the limits on technology used in medicine. We’re confident enough to move fast and talented enough not to break things.

What will you do every day?

We are looking for a Software Architect to join our Engineering team. Suki's architecture practice is at an inflection point. We have sustained significant CCU growth, completed a demanding OLTP database scaling journey, leveraged Redis Streams-based service orchestration, deployed self-hosted ML models, built agent-driven workflows, launched an Offline Data Platform, and maintained a client surface spanning iOS, Android, Web, SDKs, and a Chrome extension, with on-device ML capabilities across all of them.

The Architect is the person who holds institutional memory and architectural coherence across this surface area, so that decisions made in isolation don't quietly contradict each other, patterns are reused rather than reinvented, and deviations from established patterns are documented, not silently absorbed. The work ahead includes enhanced agentic capabilities, continued multi-region expansion, investment in stream-based and offline data processing, and a growing ML inference footprint.

On a day-to-day basis, you will:

Maintain architectural coherence across a complex and wide surface area.
Own the load-bearing substrate through future growth and failure scenarios. The system must survive several multiples of current enterprise scale and every class of infrastructure event we can name.
Drive the structural decomposition of platform and domain boundaries, service boundaries across the microservice portfolio, agentic framework direction, auth and access-control architecture, and cross-cutting concerns like observability and rate limiting.
Bridge AI research and production infrastructure. Translate LLM and model requirements into efficient, production-grade services. Lead the agentic framework direction — memory architectures, tool orchestration, A2A patterns, semantic caching. Operate a GPU fleet at high duty cycle and understand dynamic batching, cross-session batching tradeoffs, and HPA calibration for GPU workloads.
Document architectural patterns and deviations. Write Architectural Design Records, RFCs, and breaking-points documents that stand alone for readers unfamiliar with the context. Document deviations from established patterns with their reasons — not as failures, but as data points.
Remain hands-on — stay connected to the craft, read and write code. Prototype new architectures. Review pull requests for patterns that matter. Implement the hard parts of a new pattern yourself when it matters. Lead resolution of the most critical incidents.
Mentor and elevate the engineering organization. Act as a technical multiplier for senior engineers across backend, data, and ML disciplines. Hold a room, defend a position under pushback without collapsing into agreement, and adjust register for engineering vs. product vs. leadership audiences. Foster a culture of deep technical understanding, not just pattern-matching.
Represent Suki’s architecture and engineering in both internal and external forums through technical blogs, conferences, and meetups.

**QUALIFICATIONS**

Bachelor's or Master's degree in Computer Science or a related field, or equivalent production experience
10+ years of professional software engineering experience, with a proven track record operating at an Architect, Staff, or Principal level
Expert-level proficiency in Go or Python, with the ability to lead architectural decisions and remain hands-on in code
Production ownership of a real-time streaming system (Redis Streams, Kafka, Pulsar, or equivalent) at scale — including personal pager experience and a specific debugging story for a duplicate-processing or lost-message bug
Experience operating multi-TB storage systems with active OLTP traffic
Hands-on experience with offline/batch architectures: Airflow or equivalent orchestration, columnar formats (Iceberg, Delta, or Hudi), and warehouse-scale query engines (BigQuery, Snowflake, or Redshift)
Experience designing and operating GPU inference infrastructure at scale, including dynamic batching, Triton/TensorRT, and HPA calibration for GPU workloads — or equivalent ML inference platform depth
Experience integrating and scaling systems using LLM APIs or foundation models, with opinions about agent framework design, memory architectures, and tool orchestration
Demonstrated experience navigating product-vs-platform decomposition in a real organization — including having been wrong, and articulating what the reversing signal was
Proven ability to reason about client-side implications of backend decisions across native mobile, web, SDK, and browser extension surfaces
Familiarity with healthcare-adjacent regulation (HIPAA, PIPEDA, PHIPA, SOC2, or HITRUST) as a design constraint, not just a compliance requirement, will be a strong plus
Strong experience with GCP, Kubernetes, Docker, gRPC, and Protocol Buffers; multi-region architecture and data residency design will be a strong plus

What We're Looking For

Distinguished Technical Expertise: You are a recognized expert with strong software engineering talent and a proven history of architecting, building, and operating complex systems.
Strategic, Systems-Level Thinker: You look beyond immediate requirements to anticipate future challenges. You can navigate deep ambiguity, identify long-term risks, and make high-judgment decisions that balance technical purity with business velocity.
Pragmatic and Action-Oriented: You have a bias for action and a data-driven approach to problem-solving. You understand the entire lifecycle of an ML-powered feature, from data to deployment, and know how to ship robust solutions quickly.
Leadership Through Influence: You have a demonstrated ability to lead complex, cross-functional technical initiatives without direct authority. You build

Tell me more about Suki

On a roll: Named by Fast Company as one of the most innovative companies, named Google’s Partner of the Year for AI/ML, named by Forbes as one of the top 50 companies in AI.
**Great team:** Founded, managed, and backed by successful tech veterans from Google and Apple and medical leaders from UCSF and Stanford. We have technologists and doctors working side-by-side to solve complex problems.
**Great investors:** We’re backed by Venrock, First Round Capital, Flare Capital, March Capital, Hedosophia and others. With our $165M raised so far, we have the resources to scale.
Huge market: Disrupting a massive, growing $30+ billion market for transcription, dictation, and order-entry solutions. Our vision is to become the voice user interface for healthcare, relieving the administrative burden on doctors instead of adding to it.
Great customers: Our solutions are used in health systems and clinics across the country, supporting clinicians across dozens of specialties. Check out what one of our users says about how Suki has helped his practice.
**Impact:** You’ll make an impact from day one. You’ll join a team working towards a shared purpose with a culture built upon deep empathy for doctors and passion for making their lives better.

Suki is an Equal Opportunity Employer. We are dedicated to building a company that fosters inclusion and belonging and reflects the diverse communities we serve across the country. We know we are stronger this way, and we look forward to growing our team with these shared values.