Sr. Software Engineer - Data Query Platform (GBR, Hybrid)

CrowdStrike · Enterprise · London, United Kingdom

Senior Software Engineer role focused on building and operating a hyper-scale data lake for a cybersecurity company. The role involves developing and maintaining ultra-high-scale data platforms, processing petabytes of data using Java, Spark/Scala, and AWS native tooling, and enabling data access for analytics, machine learning, and threat hunting.

What you'd actually do

  1. Write highly fault-tolerant Java code within Apache Spark to produce platform products used by our customers to query our event pipelines/ingestion for insight into active threat trends and related analytics
  2. Design, develop, and maintain ultra-high-scale data platforms that process petabytes of data
  3. Participate in technical reviews of our products and help us develop new features and enhance stability
  4. Continually help us improve the efficiency and reduce the latency of our high-performance services to delight our customers
  5. Research and implement new ways for both internal stakeholders and customers to query their data efficiently and extract results in the format they desire

Skills

Required

  • 10+ years of combined experience across backend/cloud development and data platform engineering roles
  • 5+ years of experience building data platform product(s) or features with (one of) Apache Spark, Flink or Iceberg, or with comparable tools in GCP
  • 5+ years of experience programming with Java, Scala or Kotlin.
  • Proven experience owning robust feature/product design end to end, yourself, especially with vaguely defined problem statements or only 'loose' specs leading the way.
  • Proven expertise with algorithms, distributed systems design and the software development lifecycle
  • Experience building large-scale data/event pipelines
  • Expertise designing solutions with relational SQL and NoSQL databases, including Postgres/MySQL, Cassandra, DynamoDB
  • Good test-driven development discipline
  • Reasonable proficiency with Linux administration tools
  • Proven ability to work effectively with remote teams

Nice to have

  • Go
  • Pinot or other time-series/OLAP-style database
  • Iceberg
  • Kubernetes
  • Jenkins
  • Parquet
  • Protocol Buffers/gRPC

What the JD emphasized

  • Proven experience owning robust feature/product design end to end, yourself, especially with vaguely defined problem statements or only 'loose' specs leading the way.