Developer Productivity

LangChain · Data AI · San Francisco, CA · Engineering

LangChain is seeking a Software Engineer for its Infrastructure team to own developer productivity for the LangGraph Cloud and LangSmith products. The role involves ensuring reliability, scalability, and quality for Kubernetes-based services, APIs, and UI flows, with a focus on pioneering quality practices for LLM applications, such as prompt regression testing and evaluation suites. The engineer will be responsible for the end-to-end test strategy, setting up test environments, improving CI/CD pipelines, building observability into testing, establishing performance baselines, and partnering on incident workflows.

What you'd actually do

  1. Own test strategy end-to-end across APIs, services, UI, data, and infrastructure (Kubernetes, Terraform, Helm)
  2. Stand up ephemeral test environments in Kubernetes for pull requests and release candidates; seed test data and run hermetic test suites
  3. Shift quality earlier in CI/CD pipelines (GitHub Actions) through parallelization, caching, deterministic seeds, flake tracking, and quality gates
  4. Build observability into testing workflows with rich failure artifacts such as logs, traces, and dashboards
  5. Establish performance and reliability baselines for critical paths, including SLIs, SLOs, and regression detection

Skills

Required

  • Python
  • testing frameworks such as pytest
  • CI/CD systems (GitHub Actions preferred)
  • API testing
  • mocking/stubbing
  • data setup/teardown
  • defining quality standards
  • writing test plans
  • driving cross-team execution

Nice to have

  • load and performance testing tools such as k6
  • observability tooling such as Datadog or OpenTelemetry
  • Kubernetes and containerized environments
  • Helm
  • Terraform
  • Kubernetes networking
  • secrets management
  • SQL fluency
  • Go
  • Node
  • React

What the JD emphasized

  • Kubernetes
  • test strategy
  • testing
  • quality
  • evaluations