Getting Started with dbt and Snowflake: A Complete ELT Data Transformation Guide
Last updated: January 2026
Quick answer: dbt (data build tool) with Snowflake follows the ELT pattern: raw data is loaded into Snowflake first, then dbt transforms it using version-controlled SQL models. Connect dbt to Snowflake via Partner Connect, write SQL SELECT statements as models in the /models directory, define schema tests in YAML files, and run dbt run and dbt test to execute transformations.
Introduction
dbt with Snowflake lets you write SQL SELECT statements that automatically materialize as tables, views, or incremental models in your warehouse — with built-in testing, documentation, and lineage tracking. Setup takes about 15 minutes: install the dbt-snowflake adapter, configure profiles.yml with your Snowflake credentials, and run dbt init. Alternatively, use Snowflake Partner Connect, as this guide does, to skip the manual configuration entirely.
This dbt Snowflake guide walks you through setting up dbt with Snowflake using Partner Connect, creating your first transformation models, defining data quality tests, and running transformations in dbt Cloud. By the end, you will understand the ELT architecture pattern and how dbt fits into your Snowflake data pipeline.
What is ELT Architecture?
ELT stands for Extract, Load, Transform. Unlike traditional ETL (Extract, Transform, Load), ELT loads raw data into the data warehouse first and then transforms it using the warehouse's processing power. For a comparison of how to manage compute workloads for ETL vs analytics in Snowflake, see our dedicated guide. This is the modern approach adopted by most cloud-native data platforms.
- Extract: Pull data from source systems such as APIs, databases, flat files, and SaaS applications.
- Load: Ingest raw data into Snowflake staging tables without modification. Tools like Fivetran, Airbyte, or Snowflake's COPY INTO command handle this step. You can load data from Azure Cloud, Google Cloud Platform, or AWS S3.
- Transform: Use dbt to clean, model, and aggregate data inside Snowflake. This is where business logic lives, and dbt manages it with version-controlled SQL.
This approach leverages Snowflake's elastic compute to handle transformations at scale, eliminating the need for separate ETL servers. You pay only for the compute time used during transformation, and Snowflake automatically scales resources up or down based on workload demands.
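The Load step can be sketched with Snowflake's COPY INTO command. This is a minimal illustration, assuming a hypothetical external stage named @raw_stage and a pre-created RAW_ORDERS staging table; your stage, file format, and table names will differ.

```sql
-- Load raw CSV files from a stage into a staging table, unmodified.
-- @raw_stage and RAW_ORDERS are illustrative names, not objects
-- created by Partner Connect.
COPY INTO RAW_ORDERS
FROM @raw_stage/orders/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
ON_ERROR = 'ABORT_STATEMENT';
```

Note that no business logic runs here: transformation of this raw table happens downstream in dbt.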
Why Use dbt with Snowflake?
dbt has become the standard transformation layer in the modern data stack for several compelling reasons:
- SQL-based transformations - no Python or Spark required. Analysts who know SQL can build and maintain production data models.
- Version-controlled data models using Git - every change to your transformation logic is tracked, reviewed, and auditable.
- Automated data quality testing with schema.yml - define tests like unique, not_null, accepted_values, and relationships to catch data issues before they reach dashboards.
- Built-in documentation generation - dbt automatically generates a searchable documentation site from your model descriptions and column definitions.
- Incremental model support for large datasets - process only new or changed rows instead of rebuilding entire tables, reducing compute cost and execution time.
- Native integration via Snowflake Partner Connect - connect dbt Cloud to Snowflake in minutes with automatic warehouse, database, and role provisioning.
Step 1: Connect dbt to Snowflake via Partner Connect
Snowflake Partner Connect provides a streamlined way to set up dbt Cloud with your Snowflake account. It automatically provisions all necessary resources so you can start building models immediately.
How to Set Up Partner Connect
- Log into your Snowflake account as ACCOUNTADMIN (required for Partner Connect).
- Navigate to Admin → Partner Connect in the left sidebar.
- Find and select dbt from the list of available partners.
- Click Connect to begin the automatic provisioning process.
Snowflake automatically creates the following resources:
| Resource | Name | Details |
|---|---|---|
| Warehouse | PC_DBT_WH | X-Small size, auto-suspend enabled |
| Database | PC_DBT_DB | Target database for dbt models |
| Role | PC_DBT_ROLE | Dedicated role with required permissions |
| Permissions | Grants on PC_DBT_ROLE | Required privileges for dbt, including programmatic authentication |
Step 2: Register for dbt Cloud
After the Partner Connect setup completes, Snowflake redirects you to dbt Cloud to finish account creation.
- You are redirected to the dbt Cloud signup page with your Snowflake connection details pre-filled.
- Provide your email address, account name, and create a password.
- Accept the dbt Cloud Terms of Service.
- Your dbt Cloud trial account is activated automatically.
- The Snowflake connection is pre-configured - no manual connection setup required.
The dbt Cloud Developer plan is free for individual use and includes everything you need to get started: a web-based IDE, job scheduling, and documentation hosting.
Step 3: Navigate dbt Cloud IDE
Once your account is set up, navigate to the dbt Cloud IDE (called dbt Studio) where you will write and test your transformation models.
- Open dbt Cloud and navigate to Studio from the top menu.
- dbt Studio provides an online IDE for writing SQL transformation models, previewing results, and running commands.
- Familiarize yourself with the key files and folders in your project:
| File / Folder | Purpose |
|---|---|
| models/ | SQL transformation files (each file becomes a table or view in Snowflake) |
| schema.yml | Model definitions, column descriptions, and data quality tests |
| dbt_project.yml | Project configuration including project name, version, and materialization defaults |
| macros/ | Reusable SQL snippets and Jinja templates |
| seeds/ | CSV files that dbt loads into Snowflake as reference tables |
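A minimal dbt_project.yml might look like the sketch below. Partner Connect generates this file for you, so treat the project name and folder structure here as illustrative rather than required.

```yaml
# Illustrative dbt_project.yml sketch; your generated project will differ.
name: 'my_snowflake_project'
version: '1.0.0'
profile: 'default'

models:
  my_snowflake_project:
    +materialized: view   # project-wide default; override per folder or model
    marts:
      +materialized: table   # e.g., build everything under models/marts/ as tables
```

The `+materialized` keys set defaults hierarchically, so individual models only need an inline config when they deviate from the folder-level choice.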
Step 4: Create Your First dbt Model
A dbt model is a SQL SELECT statement saved as a .sql file in the models/ directory. When you run dbt, each model creates a corresponding table or view in Snowflake. Models use the ref() function to reference other models, which allows dbt to automatically determine the correct execution order.
Example Model: models/customer_orders.sql
SELECT
    c.customer_id,
    c.customer_name,
    COUNT(o.order_id) AS total_orders,
    SUM(o.order_amount) AS total_revenue,
    MAX(o.order_date) AS last_order_date
FROM {{ ref('raw_customers') }} c
LEFT JOIN {{ ref('raw_orders') }} o
    ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name
When you run dbt run, this SELECT statement executes against Snowflake and creates a table or view named customer_orders in the PC_DBT_DB database. The {{ ref('raw_customers') }} and {{ ref('raw_orders') }} references tell dbt to resolve the fully qualified table names and ensure those models run first.
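To make the ref() resolution concrete, here is roughly what dbt compiles the model above into for the default view materialization. The PUBLIC schema is an assumption; the actual schema comes from your connection's target settings.

```sql
-- Approximate compiled output (schema name illustrative):
CREATE OR REPLACE VIEW PC_DBT_DB.PUBLIC.customer_orders AS
SELECT
    c.customer_id,
    c.customer_name,
    COUNT(o.order_id) AS total_orders,
    SUM(o.order_amount) AS total_revenue,
    MAX(o.order_date) AS last_order_date
FROM PC_DBT_DB.PUBLIC.raw_customers c
LEFT JOIN PC_DBT_DB.PUBLIC.raw_orders o
    ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;
```

You can inspect the exact compiled SQL for your own project in the target/ directory after running dbt compile.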
Understanding Model Materializations
dbt supports several materialization strategies that control how your model is stored in Snowflake:
| Materialization | Behavior | Best For |
|---|---|---|
| view | Creates a SQL view (default) | Lightweight transformations, staging models |
| table | Creates a physical table (full rebuild each run) | Final mart tables, frequently queried models |
| incremental | Inserts only new or changed rows | Large fact tables, event data, append-only logs |
| ephemeral | Compiled as a CTE (no object created in Snowflake) | Intermediate logic reused across multiple models |
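The incremental strategy is the one that needs extra model code. The sketch below shows the common pattern, assuming a hypothetical raw_orders model with order_id and order_date columns; the column names are illustrative.

```sql
-- models/orders_incremental.sql — illustrative incremental model.
{{ config(materialized='incremental', unique_key='order_id') }}

SELECT
    order_id,
    customer_id,
    order_amount,
    order_date
FROM {{ ref('raw_orders') }}

{% if is_incremental() %}
  -- On incremental runs, only pull rows newer than what is already loaded.
  -- {{ this }} resolves to this model's own table in Snowflake.
  WHERE order_date > (SELECT MAX(order_date) FROM {{ this }})
{% endif %}
```

On the first run (or with the --full-refresh flag) the is_incremental() block is skipped and the full table is built; on subsequent runs only the filtered rows are processed.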
Step 5: Define Schema and Tests in schema.yml
The schema.yml file defines model metadata and data quality tests. These tests run automatically when you execute dbt test and help catch data issues before they propagate to downstream reports and dashboards.
Example schema.yml
version: 2

models:
  - name: customer_orders
    description: "Aggregated customer order summary"
    columns:
      - name: customer_id
        description: "Unique customer identifier"
        tests:
          - unique
          - not_null
      - name: total_orders
        description: "Total number of orders placed"
        tests:
          - not_null
      - name: total_revenue
        description: "Total revenue from all orders"
Available Built-in Tests
| Test | What It Checks |
|---|---|
| unique | Ensures every value in the column is unique (no duplicates) |
| not_null | Ensures no NULL values exist in the column |
| accepted_values | Ensures column values are within a specified list |
| relationships | Ensures referential integrity between two models (foreign key check) |
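The accepted_values and relationships tests take arguments, so their YAML is slightly more involved than unique or not_null. A hedged example, assuming a raw_orders model with an order_status column (both names are illustrative):

```yaml
# Illustrative schema.yml snippet for parameterized tests.
version: 2

models:
  - name: raw_orders
    columns:
      - name: order_status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'returned']
      - name: customer_id
        tests:
          - relationships:
              to: ref('raw_customers')
              field: customer_id
```

The relationships test asserts that every customer_id in raw_orders exists in raw_customers, which is the warehouse-side equivalent of a foreign key check.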
You can also write custom tests using SQL and Jinja. Custom tests are stored in the tests/ directory and return rows that fail the test condition.
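As a sketch of such a custom (singular) test, the query below flags customers with negative revenue; the file name and business rule are hypothetical:

```sql
-- tests/assert_no_negative_revenue.sql
-- A singular test passes when this query returns zero rows.
SELECT
    customer_id,
    total_revenue
FROM {{ ref('customer_orders') }}
WHERE total_revenue < 0
```

dbt test picks this file up automatically from the tests/ directory alongside the schema.yml tests.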
Step 6: Compile and Run dbt Models
With your models and tests defined, you are ready to compile and execute them against Snowflake. dbt provides three core commands for this workflow:
dbt compile # Validates SQL and resolves ref() references
dbt run # Executes models and creates tables/views in Snowflake
dbt test # Runs data quality tests defined in schema.yml
What Each Command Does
- dbt compile: Parses all SQL files, resolves ref() and source() references, and generates compiled SQL. This step validates your project without creating any objects in Snowflake. Use it to catch syntax errors early.
- dbt run: Compiles each model, determines the correct execution order based on ref() dependencies, and runs each SELECT statement against Snowflake. Tables and views are created or replaced in PC_DBT_DB.
- dbt test: Executes all tests defined in schema.yml files. Each test runs a query against Snowflake and reports pass or fail results. Failed tests return the rows that violated the test condition.
After running dbt run, log into Snowflake and verify that your tables were created in the PC_DBT_DB database under the appropriate schema.
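Beyond running the whole project, dbt's node selection syntax lets you target individual models. A few commonly useful forms, using the model name from this guide:

```shell
# Run one model and everything downstream of it
# (the trailing + selects descendants in the dependency graph)
dbt run --select customer_orders+

# Run and test selected models in dependency order with a single command
dbt build --select customer_orders

# Inspect the compiled SQL for a model without creating any objects
dbt compile --select customer_orders
```

Selective runs keep development loops fast and keep Snowflake compute spend down, since only the touched part of the DAG is rebuilt.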
Best Practices for dbt with Snowflake
Follow these best practices to build maintainable, scalable, and reliable dbt projects on Snowflake:
- Use ref() to reference other models instead of hardcoding table names. This ensures dbt builds models in the correct dependency order and makes your project portable across environments.
- Organize models into staging, intermediate, and mart layers:
- Staging models - clean and rename raw source data (one model per source table).
- Intermediate models - apply business logic, joins, and calculations.
- Mart models - final tables optimized for reporting and analytics consumption.
- Use incremental models for large tables to avoid full refreshes. Use the is_incremental() macro to filter for only new or changed rows.
- Add tests to every model to catch data quality issues early. At minimum, test primary keys with unique and not_null.
- Use dbt docs generate to create automatic documentation. This produces a searchable website with model lineage graphs, column descriptions, and test definitions.
- Use Snowflake's warehouse sizing strategically. Start with X-Small for development and scale up for production runs with large datasets.
- Leverage dbt's source freshness checks to monitor when raw data was last loaded, ensuring your transformations run on up-to-date data.
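Source freshness checks are configured in a sources block. The sketch below is illustrative: the source name, schema, and _loaded_at timestamp column are assumptions, and the thresholds should match your actual load cadence.

```yaml
# Illustrative sources block with freshness thresholds.
version: 2

sources:
  - name: raw
    database: PC_DBT_DB
    schema: RAW
    loaded_at_field: _loaded_at
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: raw_orders
```

Running dbt source freshness then compares the newest _loaded_at value against these thresholds and warns or errors when the raw data is stale.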
Conclusion
dbt transforms Snowflake from a data warehouse into a complete analytics engineering platform. By combining Snowflake's elastic compute with dbt's SQL-based transformation framework, you get a powerful, testable, and version-controlled data pipeline that scales with your organization's needs.
In this guide, you learned about the ELT architecture pattern and how it differs from traditional ETL, how to connect dbt to Snowflake using Partner Connect, how to create SQL models and define data quality tests in schema.yml, and best practices for organizing and running your dbt project. With these foundations in place, you are ready to build production-grade data transformations on Snowflake using dbt.
Frequently Asked Questions
Q: What is the difference between ETL and ELT?
ETL transforms data before loading it into the warehouse, requiring separate compute infrastructure. ELT loads raw data first and transforms it inside the warehouse (like Snowflake), leveraging the warehouse's processing power for more scalable transformations.
Q: Is dbt free to use with Snowflake?
dbt Core (command-line) is free and open source. dbt Cloud offers a free Developer plan for individual use. Snowflake Partner Connect provides a trial dbt Cloud account with pre-configured Snowflake connectivity.
Q: What happens when you run 'dbt run'?
dbt compiles each model's SQL, resolves ref() dependencies to determine execution order, and then runs each SELECT statement against Snowflake, creating tables or views in the target database (PC_DBT_DB by default).
Q: Can dbt handle incremental data loads?
Yes. dbt supports incremental models that only process new or changed rows instead of rebuilding the entire table. You configure this with the incremental materialization strategy and an is_incremental() filter in your model SQL.