Applied Data Scientist II

Microsoft · Big Tech · Bengaluru, KA, IN · Applied Sciences

This role focuses on developing and operationalizing machine learning models for threat detection within Microsoft's security ecosystem. It involves building supervised and unsupervised models, applying graph-focused ML techniques, and analyzing large security datasets. The role also includes data engineering for feature pipelines and running experiments to improve detection quality, with a strong emphasis on translating research into production-ready solutions.

What you'd actually do

  1. Develop supervised and unsupervised ML models for anomaly detection, fraud/threat pattern discovery, alert classification, confidence scoring, and signal fidelity improvements.
  2. Contribute to graph construction logic, schema evolution, and ontology-driven enrichment for Verdict Net, Verdict Propagation, Campaign Graphs, and Vortex insights.
  3. Analyze large, noisy, high‑dimensional security datasets using ADX/Kusto, Spark, and distributed compute platforms.
  4. Collaborate with detection engineering, threat research, product, and red teams to integrate ML outcomes into real-world protection experiences.

Skills

Required

  • Python
  • ML frameworks (PyTorch/TensorFlow)
  • data processing libraries
  • gradient-boosted models
  • supervised/unsupervised learning
  • embeddings
  • clustering
  • anomaly detection
  • Kusto
  • SQL
  • Spark
  • probability
  • statistics
  • algorithmic thinking

Nice to have

  • GNNs
  • graph embeddings
  • similarity scoring
  • relationship modeling
  • graph traversal
  • multi-hop reasoning
  • cluster detection algorithms
  • ADX

What the JD emphasized

  • 6+ years of hands-on DS/ML experience

Other signals

  • Apply graph-focused ML techniques (graph embeddings, GNNs, similarity scoring, relationship modeling).
  • Run A/B experiments, offline evaluations, and benchmark models to continually improve detection quality.