Senior Software Engineer - Cloud Services

The Trade Desk · Media · San Francisco, CA · Software Engineering

Senior Software Engineer focused on building and maintaining scalable, service-oriented infrastructure solutions for a globally distributed system. The role involves creating self-service tooling and automation to improve product team efficiency and operational health, and it calls for an SRE mindset that covers configuration management, capacity modeling, and monitoring. Experience with Kubernetes, Kafka, cloud providers, and infrastructure as code is required.

What you'd actually do

  1. Create and maintain in-house, service-oriented solutions at scale for the infrastructure required to run a globally distributed system handling over 15 million requests per second.
  2. Help product teams ship more efficiently and safely through automation, tools, and processes that all teams at The Trade Desk can use.
  3. Ensure the supportability of our infrastructure by building, implementing, operating, and adding features to self-service tooling and automation.
  4. Participate in root-cause analysis and postmortem discussions to effectively drive long-term operational health improvements.
  5. Identify process gaps and implement solutions to speed up execution and reduce manual toil.

Skills

Required

  • TypeScript
  • Go/Golang
  • C#
  • Designing, developing, deploying, and supporting service-oriented applications
  • Kubernetes
  • Docker
  • ArgoCD
  • Backstage
  • Kafka
  • Service Discovery (e.g., Consul)
  • AWS
  • Azure
  • Alibaba Cloud (Aliyun)
  • Linux operating system internals, filesystems, storage technologies, protocols, and networking stack
  • Terraform
  • Ansible
  • CloudFormation
  • systems design
  • always-on systems
  • data-driven approach
  • reducing complexity and cutting operational risks
  • cost and return on investment analysis
  • making significant, self-directed contributions to large and impactful projects
  • communication
  • documentation

Nice to have

  • understanding of the advantages and drawbacks of various approaches

What the JD emphasized

  • globally distributed system handling over 15 million requests per second
  • self-service tooling and automation
  • root-cause analysis and postmortem discussions
  • speed up execution and reduce manual toil