Data Warehouse Engineer

Together AI · Data AI · San Francisco, CA · Engineering

A Staff Data Warehouse Engineer responsible for designing, operating, and evolving a medallion data warehouse stack (bronze/silver/gold), owning core data models and metrics, and establishing data quality and governance standards. The role spans building and maintaining data pipelines, designing analytics-ready models, leading Master Data Management (MDM) patterns, implementing automated data quality checks, and building a business semantic layer, using SQL, Python, and Spark. The engineer will also mentor junior engineers and contribute to technical standards.

What you'd actually do

  1. Architect and operate a medallion/curated data warehouse stack (bronze/silver/gold) for product, usage, billing, and operational data.
  2. Build and maintain Airflow-orchestrated pipelines and dbt transformation projects (modular, tested, documented).
  3. Design analytics-ready models: SCD Type 2, star schemas, and appropriate normalization for upstream canonical layers.
  4. Lead Master Data Management (MDM) patterns (golden records, reference data, deduping, identity resolution).
  5. Implement and automate data quality checks (freshness, nulls, referential integrity, distribution drift, anomaly detection).
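The data quality checks in item 5 can be sketched in plain Python; the table shape (rows as dicts) and the field names are illustrative assumptions, not part of the JD — in practice these would typically run as dbt tests or Airflow sensor tasks:

```python
from datetime import datetime, timedelta, timezone

def check_freshness(rows, ts_field, max_age_hours=24):
    """Pass if the newest row is within the allowed staleness window."""
    newest = max(row[ts_field] for row in rows)
    return datetime.now(timezone.utc) - newest <= timedelta(hours=max_age_hours)

def check_null_rate(rows, field, max_null_rate=0.01):
    """Pass if the share of NULLs in a column stays under a threshold."""
    nulls = sum(1 for row in rows if row.get(field) is None)
    return nulls / len(rows) <= max_null_rate

def check_referential_integrity(fact_rows, fk_field, dim_keys):
    """Pass if every foreign key in the fact rows exists in the dimension."""
    return all(row[fk_field] in dim_keys for row in fact_rows)
```

Distribution-drift and anomaly checks follow the same pattern: compute a statistic over the new batch and compare it against a baseline with a tolerance.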

Skills

Required

  • Data warehousing
  • Data modeling
  • ETL/ELT
  • SQL
  • Python
  • dbt
  • Airflow
  • Data quality
  • Data governance
  • MDM
  • SCD Type 2
  • Star schemas
  • Stakeholder management

Nice to have

  • Spark
  • PySpark

What the JD emphasized

  • Strong warehouse fundamentals and production experience delivering trusted datasets and metrics.
  • Expert SQL (window functions, dimensional modeling, performance tuning).
  • Hands-on with dbt (models, tests, docs, snapshots, macros) and Airflow (DAG design, backfills, reliability).
  • High standards for data quality, reliability, and maintainability.
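The SCD Type 2 pattern called out above (the thing dbt snapshots automate) can be illustrated with a minimal Python sketch. The dimension layout and field names (`valid_from`, `valid_to`, `is_current`) are a common convention assumed here, not taken from the JD:

```python
from datetime import date

def apply_scd2(dim_rows, incoming, key, tracked, today):
    """One SCD Type 2 update pass: close changed versions, append new ones.

    dim_rows: existing dimension rows with valid_from/valid_to/is_current.
    incoming: latest source snapshot, one record per natural key.
    tracked:  attributes whose changes open a new row version.
    """
    current = {r[key]: r for r in dim_rows if r["is_current"]}
    out = list(dim_rows)
    for rec in incoming:
        cur = current.get(rec[key])
        if cur and all(cur[f] == rec[f] for f in tracked):
            continue  # no tracked attribute changed: keep the current version
        if cur:
            cur["is_current"] = False  # close the superseded version
            cur["valid_to"] = today
        out.append({**{f: rec[f] for f in [key, *tracked]},
                    "valid_from": today, "valid_to": None, "is_current": True})
    return out
```

This preserves full history: point-in-time queries filter on `valid_from`/`valid_to`, while current-state queries filter on `is_current`.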