Specialist Solutions Architect - Data Engineering & Observability

Databricks · Data AI · TX · Remote · Field Engineering - FE Direct Emerging

This role guides customers through cloud data engineering transformations on the Databricks Data Intelligence Platform. It requires hands-on production experience with large-scale data engineering technologies, lakehouse architecture, and data observability, including SIEM tools and anomaly detection. The Specialist Solutions Architect architects production workloads, optimizes pipelines, and assists with technical sales, specializing in an area such as data lake technology, big data streaming, or big data ingestion.

What you'd actually do

  1. Provide technical leadership to guide strategic customers to successful implementations on big data projects and large-scale data warehousing workloads.
  2. Prove the value of the Databricks Data Intelligence Platform for customer workloads by architecting production workloads.
  3. Architect production-level data pipelines, including end-to-end pipeline load performance testing and optimization.
  4. Become a technical expert in an area such as data lake technology, big data streaming, or big data ingestion and workflows.
  5. Assist Solutions Architects with more advanced aspects of the technical sale, including custom proof-of-concept content, workload sizing estimates, and custom architectures.

Skills

Required

  • 5+ years of experience in a technical role with deep expertise across data engineering and data observability.
  • Hands-on experience with data ingestion, streaming technologies (e.g., Spark Streaming, Kafka), performance tuning, troubleshooting, and debugging Spark or other big data solutions.
  • Experience building data-driven use cases, such as risk modeling, fraud detection, and customer lifetime value (LTV).
  • Experience with SIEM tools (e.g., Splunk, Elastic, Sentinel), telemetry/high-velocity log ingestion, and anomaly detection.
  • Proven track record of maintaining, scaling, and extending production data systems to evolve with complex business needs.
  • Deep expertise across multiple core data engineering domains, including:
      • Designing and scaling cost-efficient, high-performance data workloads (ETL/ELT, analytics) in cloud environments.
      • Building and migrating large-scale data pipelines, including batch, CDC (Change Data Capture), and streaming ingestion.
      • Migrating on-premises or Hadoop-based data systems to modern cloud platforms (AWS, Azure, GCP).
      • Developing and managing modern lakehouse and warehouse systems, including Delta Lake technologies, data modeling, governance, and BI integration.
  • Production programming experience in SQL and at least one of the following: Python, Scala, or Java.
  • Ability to meet expectations for technical training and role-specific milestones within 6 months of hire.

Nice to have

  • Strong familiarity with cloud infrastructure providers (AWS, Azure, or GCP).
  • Prior customer-facing experience in a pre-sales or post-sales technical role.
  • Willingness to travel up to 30% of the time, as needed.

What the JD emphasized

  • Deep expertise across data engineering and data observability.
  • Hands-on experience with data ingestion, streaming technologies (e.g., Spark Streaming, Kafka), performance tuning, troubleshooting, and debugging Spark or other big data solutions.
  • Experience building data-driven use cases, such as risk modeling, fraud detection, and customer lifetime value (LTV).
  • Experience with SIEM tools (e.g., Splunk, Elastic, Sentinel), telemetry/high-velocity log ingestion, and anomaly detection.
  • Proven track record of maintaining, scaling, and extending production data systems to evolve with complex business needs.
  • Designing and scaling cost-efficient, high-performance data workloads (ETL/ELT, analytics) in cloud environments.
  • Building and migrating large-scale data pipelines, including batch, CDC (Change Data Capture), and streaming ingestion.
  • Migrating on-premises or Hadoop-based data systems to modern cloud platforms (AWS, Azure, GCP).
  • Developing and managing modern lakehouse and warehouse systems, including Delta Lake technologies, data modeling, governance, and BI integration.