Senior Infrastructure Engineer

Webflow · Enterprise · Argentina · Remote · Engineering

This role is for a Senior Infrastructure Engineer at Webflow, focusing on the reliability, scalability, and operational excellence of core datastores (MongoDB, PostgreSQL). The engineer will partner with product and platform teams, reduce toil through automation and IaC, lead root-cause analysis for datastore issues, establish guardrails and best practices, and occasionally debug the main Webflow application. The role also involves improving on-call processes and helping define team culture. While the company is described as 'AI-native', the core responsibilities of this role are traditional infrastructure and datastore management, not direct AI/ML model development or deployment.

What you'd actually do

  1. Join our Data Stores team, which owns the reliability, scalability, and operational excellence of core datastores (MongoDB, PostgreSQL), including schema and index strategy, connection best practices, and performance tuning.
  2. Partner cross-functionally with product, application, and platform teams (including embedded engagements) to support new use cases, onboard new products, and unblock dependency-driven work.
  3. Reduce toil by designing and implementing automation and Infrastructure as Code (IaC) for datastore provisioning, upgrades, backups, and disaster recovery.
  4. Lead efforts to identify systemic datastore issues, drive root-cause analysis, and implement durable fixes to improve long-term stability.
  5. Establish guardrails, standards, and best practices to ensure consistent and scalable adoption of datastores across teams, anticipating growth and avoiding repeat failure modes.

Skills

Required

  • 5+ years of experience building, maintaining, and debugging distributed systems in a customer-facing environment with little to no downtime.
  • Hands-on experience operating and scaling production datastores (MongoDB and/or PostgreSQL), including performance tuning, schema or index design, and incident-level troubleshooting.
  • Experience supporting multiple teams using shared datastore infrastructure, including establishing guardrails, best practices, and patterns that scale with growth.
  • Experience navigating and scaling multi-tier cloud environments on AWS or GCP using containerized and Kubernetes-based architectures.
  • Experience with infrastructure-as-code tools such as Terraform or Pulumi, including managing stateful infrastructure.
  • TypeScript
  • Node
  • Go
  • AWS
  • GCP
  • Kubernetes
  • Terraform
  • Pulumi
  • MongoDB
  • PostgreSQL

Nice to have

  • Have experience designing or operating datastore platforms used by many teams, rather than a single application.
  • Have led or contributed to data migrations, major version upgrades, or the introduction of new database technologies in production.
  • Have experience with managed database services (e.g., MongoDB Atlas, RDS, Cloud SQL) and navigating tradeoffs around cost, reliability, and operational control.
  • Enjoy working in ambiguous problem spaces, helping define what “good” looks like for datastore reliability, scalability, and developer experience.

What the JD emphasized

  • 5+ years of experience building, maintaining, and debugging distributed systems in a customer-facing environment with little to no downtime.
  • Hands-on experience operating and scaling production datastores (MongoDB and/or PostgreSQL), including performance tuning, schema or index design, and incident-level troubleshooting.
  • Experience supporting multiple teams using shared datastore infrastructure, including establishing guardrails, best practices, and patterns that scale with growth.
  • Experience navigating and scaling multi-tier cloud environments on AWS or GCP using containerized and Kubernetes-based architectures.
  • Experience with infrastructure-as-code tools such as Terraform or Pulumi, including managing stateful infrastructure.