Principal Software Development Engineer

Expedia · Hospitality · Seattle, WA

This Principal Software Development Engineer role leads the architecture and delivery of real-time, AI-powered fraud and abuse defenses at global scale. The work involves designing cloud-native decisioning and automated remediation systems, simplifying the platform, and aligning cross-functional teams. The engineer will build, deploy, and operate ML in production, partner with DS/ML teams, and establish SLOs and observability.

What you'd actually do

  1. Own end-to-end architecture for low-latency risk decisioning (signals + rules + models) and automated remediation
  2. Build, deploy, and operate ML in production; partner closely with DS/ML on features, experimentation, and monitoring
  3. Simplify and modernize the platform (streaming/data pipelines, microservices, CI/CD, configuration-driven controls) to speed safe iteration
  4. Drive build-vs-buy evaluations and benchmark vendors against in-house solutions for performance, cost, and risk posture
  5. Establish clear SLOs, observability, and safety/rollback mechanisms; ensure security, privacy, and compliance are built in

Skills

Required

  • 10+ years of software engineering
  • 4+ years leading architecture for real-time, high-scale systems
  • Proven experience building and operating fraud/risk systems in production
  • Production ML experience (supervised/anomaly detection, feature pipelines, online inference, monitoring/retraining)
  • Cloud-native engineering at scale (AWS, GCP, or Azure): event-driven services, data/stream processing, observability, and cost and reliability best practices
  • Strong coding skills in one or more of Java/Scala/Go/Python; microservices, APIs, and infrastructure-as-code
  • Security, privacy, and regulatory fundamentals for sensitive data and risk decisioning
  • Experience with executive communication

Nice to have

  • LLM/agentic techniques applied to fraud; retrieval over enterprise data
  • Graph/sequence modeling or entity resolution at scale; device and behavioral signals
  • Track record reducing manual operations via automation and platform simplification
  • Hands-on experience with experimentation platforms and real-time rule/model simulation and backtesting
  • Deep cloud cost/performance tuning and capacity planning
  • Marketplace, travel, or e-commerce experience

What the JD emphasized

  • real-time, high-scale systems
  • fraud/risk systems in production
  • Production ML experience
  • Cloud-native engineering at scale
  • Security, privacy, and regulatory fundamentals

Other signals

  • AI-powered fraud and abuse defenses
  • real-time risk decisioning
  • ML in production
  • low-latency risk decisioning