Staff Machine Learning Engineer

Databricks · Data AI · Mountain View, CA · Engineering - Pipeline

Databricks is seeking GenAI Engineers to drive the development and deployment of GenAI-powered products, with a focus on improving LLM quality, expanding capabilities, and strengthening platform architecture. The role involves building ML pipelines and scalable backend systems, and working with cross-functional teams to deliver impactful AI solutions.

What you'd actually do

  1. Shape the direction of our applied AI areas and intelligence features in our products. Drive the development and deployment of state-of-the-art AI models and systems that directly impact the capabilities and performance of Databricks' products and services (e.g., Databricks Assistant and AI/BI Genie).
  2. Develop novel data collection, fine-tuning, and LLM technologies that achieve optimal performance on specific tasks and domains.
  3. Design and implement ML pipelines for data preprocessing, feature engineering, model training, hyperparameter tuning, and model evaluation, enabling rapid experimentation and iteration.
  4. Work closely with cross-functional teams, including AI researchers, ML engineers, and product teams, to deliver impactful AI solutions that enhance user productivity and satisfaction.
  5. Build scalable, reusable backend systems to support GenAI products across the company. Develop robust logging, telemetry, and evaluation harnesses to ensure reliable model performance.
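The pipeline responsibilities above (preprocessing, training, hyperparameter tuning, evaluation) can be sketched in a few lines. This is an illustrative scikit-learn example, not Databricks' actual stack; the model choice and parameter grid are assumptions.

```python
# Minimal ML pipeline sketch: preprocessing, training, hyperparameter
# tuning, and held-out evaluation. Illustrative assumptions throughout.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real data-collection/feature stage.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Preprocessing and model chained into one reusable pipeline object.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning via cross-validated grid search.
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X_train, y_train)

# Evaluation on held-out data closes the loop before deployment.
print(f"best C={search.best_params_['clf__C']}, "
      f"test acc={search.score(X_test, y_test):.2f}")
```

Chaining the scaler and model in one `Pipeline` keeps preprocessing inside the cross-validation loop, which is what makes rapid, leakage-free experimentation possible.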

Skills

Required

  • Python
  • TensorFlow/PyTorch
  • scalable ML architectures
  • LLM fine-tuning
  • prompt engineering
  • retrieval-augmented generation (RAG)
  • LLM technologies
  • generative and embedding techniques
  • modern model architectures
  • fine-tuning / pre-training datasets
  • evaluation benchmarks
  • end-to-end model development
  • research and prototyping
  • deployment and monitoring
  • analytical and problem-solving skills
  • coding and software engineering skills
  • software engineering principles around testing, code reviews and deployment
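Of the skills listed, retrieval-augmented generation (RAG) is the most compact to illustrate: retrieve the documents most relevant to a query, then build an augmented prompt for an LLM. A minimal sketch follows; the corpus, bag-of-words scoring, and prompt template are illustrative stand-ins, not a production design.

```python
# Minimal RAG sketch: rank documents against a query, then assemble
# an augmented prompt. All names and data here are illustrative.
from collections import Counter
import math

corpus = {
    "doc1": "Databricks Assistant helps users write and fix SQL and Python code",
    "doc2": "AI/BI Genie answers natural-language questions over business data",
    "doc3": "Feature engineering transforms raw data into model inputs",
}

def embed(text: str) -> Counter:
    # Bag-of-words counts stand in for a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank documents by similarity to the query; keep the top k.
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(corpus[d])),
                    reverse=True)
    return ranked[:k]

query = "how do I ask questions about business data"
context = "\n".join(corpus[d] for d in retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
# `prompt` would be sent to an LLM; no model call is made here.
print(prompt)
```

In a real system the embedding function would be a trained model and the corpus a vector index, but the retrieve-then-prompt structure is the same.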

What the JD emphasized

  • state-of-the-art AI models and systems
  • novel data collection, fine-tuning, and LLM technologies
  • scalable, reusable backend systems
  • strong track record of working with language-modeling technologies
  • ability to drive end-to-end model development, from research and prototyping to deployment and monitoring

Other signals

  • GenAI products
  • LLM quality
  • GenAI capabilities
  • platform architecture