Sr. Engineer - Data & ML Platform (hybrid)

CrowdStrike · Enterprise · Bangalore, India

This role focuses on building and facilitating adoption of a modern Data+ML platform, modularizing ML code, establishing repeatable patterns for model development, deployment, and monitoring, and building a scalable platform for ML experimentation pipelines. It involves leveraging workflow orchestration tools, cloud services, and CI/CD frameworks, with a future focus on generative AI use cases. The role emphasizes production-focused ML engineering and bridging the gap between model development and operational success.

What you'd actually do

  1. Help design, build, and facilitate adoption of a modern Data+ML platform
  2. Modularize complex ML code into standardized and repeatable components
  3. Establish and facilitate adoption of repeatable patterns for model development, deployment, and monitoring
  4. Build a platform that scales to thousands of users and offers self-service capability to build ML experimentation pipelines
  5. Leverage workflow orchestration tools to run complex data and ML pipelines efficiently and at scale

Skills

Required

  • B.S. in Computer Science, Data Science, Statistics, Applied Mathematics, or a related field with 10+ years of related experience; or M.S. with 8+ years of experience; or Ph.D. with 6+ years of experience.
  • 3+ years of experience developing and deploying machine learning solutions to production.
  • Familiarity with typical machine learning algorithms from an engineering perspective
  • Familiarity with supervised / unsupervised approaches: how, why, and when labeled data is created and used
  • 3+ years of experience with ML platform tools like Jupyter Notebooks, NVIDIA Workbench, MLflow, Ray, Vertex AI, etc.
  • Experience building data platform product(s) or features with (one of) Apache Spark, Flink or comparable tools in GCP.
  • Proficiency in distributed computing and orchestration technologies (Kubernetes, Airflow, etc.)
  • Production experience with infrastructure-as-code tools such as Terraform, FluxCD
  • Expert-level experience with Python
  • Expert-level experience with CI/CD frameworks such as GitHub Actions
  • Expert-level experience with containerization frameworks
  • Strong analytical and problem-solving skills, capable of working in a dynamic environment
  • Exceptional interpersonal and communication skills.
  • Distributed Systems Knowledge
  • Data Platform Experience
  • Machine Learning concepts

Nice to have

  • Java/Scala exposure
  • Iceberg is highly desirable
  • Go
  • Pinot or other time-series/OLAP-style database
  • Jenkins
  • Parquet
  • Protocol Buffers/gRPC

What the JD emphasized

  • production-focused culture
  • model development
  • deployment
  • monitoring
  • ML pipelines
  • ML Experimentation Platform

Other signals

  • ML Experimentation Platform from the ground up
  • production-focused culture that bridges the gap between model development and operational success
  • generative AI investments