Staff Software Engineer - Search Quality

Databricks · Data AI · Bangalore, India · Engineering - Pipeline

Databricks is seeking a Staff Software Engineer for Search Quality to build and optimize the retrieval backbone for both AI agents and human users. This role involves hybrid retrieval (keyword and semantic search), dual-optimization of ranking models, ensuring high-stakes accuracy for AI grounding, and handling data heterogeneity. The engineer will apply IR principles to RAG systems and build evaluation frameworks for relevance metrics.

What you'd actually do

You will own the quality of results for two distinct but deeply connected "users":

  • The AI Agent: Optimizing the retrieval layer that allows LLMs to reason over data they weren’t trained on, ensuring they have the "ground truth" needed to synthesize accurate, high-stakes business actions.
  • The Human User: Improving the traditional search experience so employees can find assets and answers through intuitive, high-recall interfaces.

  1. Hybrid Retrieval: Balancing traditional keyword-based search (for exactness) with semantic vector search (for intent).
  2. Dual-Optimization: Fine-tuning ranking models that satisfy both human readability and LLM-ready context.
  3. High-Stakes Accuracy: In a world of Agents, poor search quality leads to hallucinations. You will build the guardrails and relevance scoring that ensure our AI stays grounded in reality.
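
For readers unfamiliar with hybrid retrieval, one common way to combine a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). The posting does not say which fusion method the team uses, so the sketch below is only an illustration: the function name, the document IDs, and the constant k=60 are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a value
    taken from the posting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that appear near the top of both lists rise above documents that only one retriever found, which is the balance between exactness and intent that the Hybrid Retrieval item describes.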

Skills

Required

  • Lucene/Elasticsearch
  • embeddings
  • ranking algorithms
  • evaluation frameworks
  • human-in-the-loop evaluation
  • LLM-based evaluation
  • nDCG
  • MRR
  • Precision@K
  • Retrieval-Augmented Generation (RAG)
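
The three metrics named in the list above (nDCG, MRR, Precision@K) have standard textbook definitions, sketched below for a single query. The function names and the sample judgments are hypothetical, and the posting does not describe how the team's evaluation framework computes them.

```python
import math


def precision_at_k(relevant, ranked, k):
    """Fraction of the top-k results that are in the relevant set."""
    return sum(1 for d in ranked[:k] if d in relevant) / k


def reciprocal_rank(relevant, ranked):
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """MRR: mean reciprocal rank over (relevant, ranked) pairs."""
    return sum(reciprocal_rank(rel, rk) for rel, rk in queries) / len(queries)


def ndcg_at_k(gains, ranked, k):
    """nDCG@k with graded gains (doc_id -> gain) and a log2 discount."""
    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(d, 0.0) for d in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


# Hypothetical judgments: doc_a is highly relevant, doc_b marginally so.
gains = {"doc_a": 3.0, "doc_b": 1.0}
ranked = ["doc_c", "doc_a", "doc_b"]
relevant = set(gains)

p_at_3 = precision_at_k(relevant, ranked, 3)
rr = reciprocal_rank(relevant, ranked)
ndcg = ndcg_at_k(gains, ranked, 3)
```

Precision@K ignores order within the top K, MRR looks only at the first relevant hit, and nDCG rewards placing higher-gain documents earlier, so the three are usually reported together.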

Nice to have

  • traditional keyword-based search
  • semantic vector search

What the JD emphasized

  • relevance metrics

Other signals

  • building the retrieval backbone for AI agents
  • optimizing the retrieval layer that allows LLMs to reason over data
  • building the guardrails and relevance scoring that ensure our AI stays grounded in reality
  • applying traditional Information Retrieval (IR) wisdom to the cutting-edge world of Retrieval-Augmented Generation (RAG)