Principal Software Engineer - Tech Lead

Johnson & Johnson · Pharma · Santa Clara, CA +1

Principal Software Engineer - Tech Lead for a MedTech company's AI platform, focusing on building scalable APIs, SDKs, CLIs, and UIs to support surgical AI development. The role involves owning the platform as a product, building shared ML infrastructure (serving, registry, orchestrator, training/inference control planes), designing developer experiences, and optimizing performance, safety, and cost for training and inference. Requires strong system design, API design, and end-to-end ML lifecycle experience.

What you'd actually do

  1. Take technical ownership of the platform as a product and ship scalable APIs, SDKs, CLIs, and UIs that make the ML platform easy to adopt and are tailored to accelerating surgical AI.
  2. Build shared ML infrastructure with the platform infrastructure team and its principal engineers, developing and extending core components: Serving Layer, Model Registry, Pipeline Orchestrator, and Training/Inference control planes.
  3. Design great developer experiences with defined templates, golden paths, opinionated defaults, and clear documentation to reduce the time to a first production deployment.
  4. Develop instrumentation to measure adoption, friction, reliability, and cost; use data to prioritize roadmap and validate outcomes.
  5. Partner across organizations (ML Engineering, Data Science, Infra/SRE, Security, Finance) to optimize performance, safety, and spend, especially for GPU-intensive training and high-QPS inference.

Skills

Required

  • system design experience
  • API design
  • developer experience
  • gRPC/REST APIs
  • SDKs
  • CLIs
  • simple UIs
  • end-to-end ML model lifecycle
  • model serving
  • training
  • model CI/CD
  • GPU resource management

Nice to have

  • Hands-on ML experience

What the JD emphasized

  • push the boundaries for surgical AI
  • ML Platform is central to scaling training and inference
  • build primitives and foundational capabilities
  • automating experimentation to production workflows
  • treats platform as a product
  • turn complex ML/AI workflows into clear, durable APIs and easy-to-use CLIs/UIs
  • Hands-on ML experience is a plus
  • end-to-end ML model lifecycle
  • built ML platform features that are delightful to use
  • Seasoned coder
  • ship product
  • obsess about refining developer experience and reducing friction
  • passionate about supporting ML engineers
  • translating them into clean, durable platform abstractions
  • passionate about infrastructure as code
  • automating painful manual processes
  • advocate for platform solutions
  • clear communicator

Other signals

  • build primitives and foundational capabilities that would let internal and external teams train, evaluate, deploy, operate and monitor models quickly and compliantly