Senior Database Reliability Engineer (DBRE)

Okta · Enterprise · United States · Tech Ops-610

Okta is seeking a Senior Database Reliability Engineer (DBRE) with deep expertise in PostgreSQL and MySQL to design, operationalize, and optimize the data persistence layer for its mission-critical systems. This hands-on engineering role focuses on building resilient data infrastructure and ensuring performance and reliability through architecture, automation, and incident response.

What you'd actually do

  1. Design, implement, and operate highly available PostgreSQL clusters (physical replication, logical replication, sharding/partitioning, failover automation).
  2. Optimize query performance, indexing strategies, schema design, and storage engines.
  3. Develop automation for operational tasks, including provisioning, configuration, backups, failovers, vacuum tuning, and schema management, using tools such as Terraform, Ansible, Kubernetes Operators, or custom tooling.
  4. Lead response during database incidents such as performance regressions, replication lag, deadlocks, bloat issues, and storage failures.
  5. Partner with software engineers to review SQL, optimize schemas, and ensure efficient use of PostgreSQL features.

Skills

Required

  • 4+ years of hands-on PostgreSQL experience in high-volume, distributed, or large-scale production environments.
  • Strong knowledge of PostgreSQL internals (WAL, MVCC, bloat/vacuum tuning, query planner, indexing, logical replication).
  • Production experience with MySQL (InnoDB internals, replication, performance tuning).
  • Advanced SQL and strong understanding of schema design and query optimization.
  • Experience with Linux systems, networking fundamentals, and systems troubleshooting.
  • Experience building automation with Go or Python.
  • Production experience with monitoring tools (Prometheus, Grafana, Datadog, PMM, pg_stat_statements, etc.).
  • Hands-on experience with cloud environments (AWS or GCP).

Nice to have

  • Experience with PgBouncer, HAProxy, or other connection-pooling/load-balancing layers.
  • Exposure to event streaming (Kafka, Debezium) and change data capture.
  • Experience supporting 24/7 production environments with on-call rotation.
  • Contributions to the open-source PostgreSQL ecosystem.

What the JD emphasized

  • 4+ years of hands-on PostgreSQL experience in high-volume, distributed, or large-scale production environments.
  • Strong knowledge of PostgreSQL internals (WAL, MVCC, bloat/vacuum tuning, query planner, indexing, logical replication).
  • Production experience with MySQL (InnoDB internals, replication, performance tuning).
  • Advanced SQL and strong understanding of schema design and query optimization.
  • Experience building automation with Go or Python.
  • Production experience with monitoring tools (Prometheus, Grafana, Datadog, PMM, pg_stat_statements, etc.).
  • Hands-on experience with cloud environments (AWS or GCP).