Snowflake Cortex Code: The AI Coding Agent for dbt and Airflow Teams
Quick answer: Snowflake Cortex Code is an AI coding agent unveiled on February 3, 2026 that generates dbt models, Airflow DAGs, and SQL transformations using your actual Snowflake metadata as context. Unlike general purpose tools like GitHub Copilot, Cortex Code reads your schemas, understands column types and relationships, and produces code that fits your real data environment. You can choose between Anthropic Claude Opus 4.6 or OpenAI GPT-5.2 as the underlying model. On February 23, 2026, Snowflake expanded Cortex Code to support any data, anywhere.
Last updated: March 2026
What Is Cortex Code and Why Should You Care?
If you spend your days writing dbt models, debugging Airflow DAGs, or wrangling SQL transformations, you have probably wished for an assistant that actually understands your data. Not just SQL syntax. Your actual tables, your column names, your business logic.
That is exactly what Snowflake built with Cortex Code. Unveiled on February 3, 2026, Cortex Code is an AI coding agent designed specifically for data engineering teams. It plugs into your Snowflake environment, reads your metadata, and generates code that is contextually aware of your enterprise data landscape.
This is not another chatbot that writes generic SQL snippets. Cortex Code knows that your orders table has a status column with values like 'completed', 'pending', and 'cancelled'. It knows that customer_id in your orders table maps to id in your customers table. It uses all of that context when generating code for you.
How Cortex Code Actually Works
The magic is in the metadata layer. When you connect Cortex Code to your Snowflake account, it does something that general purpose coding assistants simply cannot do. It reads your Snowflake information schema, including table definitions, column data types, primary and foreign key relationships, access policies, and even table comments.
Here is what happens under the hood when you ask Cortex Code to generate a dbt model:
- Metadata retrieval: Cortex Code queries your Snowflake metadata to understand the source tables, their columns, data types, and relationships.
- Context assembly: It builds a contextual prompt that includes your schema information, any existing dbt models in your project, and your specific request.
- Code generation: The underlying language model (your choice of Claude Opus 4.6 or GPT-5.2) generates the code using all of that context.
- Validation: The generated code is checked against your schema to catch obvious mismatches like referencing columns that do not exist.
The result is code that you can actually run. Not placeholder code with your_table_here comments scattered throughout.
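Snowflake has not published Cortex Code's internals, but the validation step described above can be sketched in a few lines of plain Python. Everything here is illustrative: the schema dictionary stands in for metadata that would, in practice, come from querying Snowflake's information schema.

```python
# Hypothetical sketch of the validation step. The schema dict below stands in
# for metadata that would really come from Snowflake's information schema.

def validate_columns(referenced, table, schema):
    """Return any referenced columns that do not exist in the table's schema."""
    return [col for col in referenced if col not in schema[table]]

# Illustrative metadata for a raw payments table (not real Cortex Code output).
schema = {"raw_payments": {"PaymentID", "OrderID", "Amount", "Status", "CreatedAt"}}

# A hallucinated column like PaymentDate gets caught before the code reaches you.
missing = validate_columns(["PaymentID", "Amount", "PaymentDate"], "raw_payments", schema)
print(missing)  # ['PaymentDate']
```

A general purpose assistant has no equivalent of this check, which is why hallucinated column names slip through.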
Choosing Your Model: Claude Opus 4.6 vs GPT-5.2
One of the more interesting design decisions Snowflake made is letting you pick the underlying AI model. You are not locked into a single provider. The two options available at launch are Anthropic Claude Opus 4.6 and OpenAI GPT-5.2.
In practice, the choice comes down to what you are generating. Our team has been experimenting with both, and here is what we have noticed so far:
- Claude Opus 4.6 tends to produce more thorough dbt model documentation and better handles complex SQL with multiple CTEs. It also tends to add more comprehensive test definitions when generating dbt schema YAML files.
- GPT-5.2 is slightly faster for straightforward tasks and does well with Python based Airflow DAG generation. It also handles Jinja templating in dbt models cleanly.
Neither is definitively better. The good news is that switching between them takes a single configuration change, so you can try both on the same task and keep whichever output you prefer.
Practical Use Case: Generating dbt Models From Raw Tables
Let us walk through a real scenario. You have a raw table called raw_payments that just landed in your Snowflake staging schema through your ELT pipeline. You need to build a staging model and a mart model on top of it.
With Cortex Code, you describe what you want in plain English:
Create a dbt staging model for raw_payments that:

- Renames columns to snake_case
- Casts amount to decimal(18,2)
- Filters out records where status = 'test'
- Adds a surrogate key using payment_id and order_id
Cortex Code reads the actual raw_payments table definition from Snowflake, sees the real column names (maybe they are PaymentID, OrderID, Amount, Status, CreatedAt), and generates something like this:
```sql
WITH source AS (
    SELECT * FROM {{ source('raw', 'raw_payments') }}
),

renamed AS (
    SELECT
        {{ dbt_utils.generate_surrogate_key(['PaymentID', 'OrderID']) }} AS payment_key,
        PaymentID AS payment_id,
        OrderID AS order_id,
        CAST(Amount AS DECIMAL(18, 2)) AS amount,
        Status AS status,
        CreatedAt AS created_at
    FROM source
    WHERE Status != 'test'
)

SELECT * FROM renamed
```
Notice that the column names match the actual source table. It did not guess. It read the metadata and used the real column names. That is the difference between context-aware code generation and a generic autocomplete.
Practical Use Case: Creating Airflow DAGs
The second major workflow Cortex Code supports is Apache Airflow. Say you need a DAG that orchestrates your daily dbt run, waits for a source table to be refreshed, and sends a Slack notification on failure.
Cortex Code generates the DAG scaffold with the correct Snowflake operator imports, your connection ID, and sensible defaults for retries and timeouts. It even structures the task dependencies correctly based on your description.
```python
from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator
from airflow.providers.slack.notifications.slack import send_slack_notification
from datetime import datetime, timedelta

default_args = {
    'owner': 'data_team',
    'retries': 2,
    'retry_delay': timedelta(minutes=5),
    'on_failure_callback': send_slack_notification(
        slack_conn_id='slack_default',
        text='DAG {{ dag.dag_id }} failed on {{ ds }}'
    ),
}

with DAG(
    dag_id='daily_payments_pipeline',
    default_args=default_args,
    schedule='@daily',
    start_date=datetime(2026, 1, 1),
    catchup=False,
    tags=['payments', 'dbt'],
) as dag:
    check_source = SnowflakeOperator(
        task_id='check_source_freshness',
        snowflake_conn_id='snowflake_default',
        sql="""SELECT COUNT(*) FROM raw.raw_payments
               WHERE created_at >= DATEADD('hour', -24, CURRENT_TIMESTAMP())""",
    )

    run_dbt = DbtCloudRunJobOperator(
        task_id='run_dbt_models',
        dbt_cloud_conn_id='dbt_cloud_default',
        job_id=12345,
        wait_for_termination=True,
    )

    check_source >> run_dbt
```
Is this production ready as is? Probably not. You will want to adjust the job ID, fine tune the retry logic, and add your own error handling. But as a starting point, it saves you 30 minutes of boilerplate work and gets the structure right from the start.
Cortex Code vs GitHub Copilot: A Fair Comparison
The obvious question is: why not just use GitHub Copilot? It is a great tool. We use it daily for general software engineering work. But for data engineering specifically, the two tools solve different problems.
| Feature | Cortex Code | GitHub Copilot |
|---|---|---|
| Data context | Reads Snowflake metadata (schemas, columns, types, relationships) | Uses open files in your editor as context |
| Best for | dbt models, Airflow DAGs, SQL transformations, Snowflake-specific code | General programming across all languages |
| Model choice | Claude Opus 4.6 or GPT-5.2 | GPT-4o and other OpenAI models |
| Security | Runs within Snowflake security perimeter, RBAC enforced | Code sent to GitHub/OpenAI servers |
| Column awareness | Knows actual column names and types from your warehouse | Guesses based on code patterns in open files |
The short version: use Copilot for writing Python functions, React components, and general code. Use Cortex Code for anything that touches your Snowflake data layer.
The February 23 Expansion: Any Data, Anywhere
Three weeks after the initial launch, Snowflake expanded Cortex Code on February 23, 2026 (announced via BusinessWire) to support working with data beyond Snowflake's own environment. The tagline was "any data, anywhere."
What this means in practice: Cortex Code can now generate transformation logic for data sources outside of Snowflake, including external tables, data shares, and even code that interacts with non-Snowflake databases. The metadata awareness still works best within Snowflake (where it has direct access to the information schema), but the code generation capabilities are no longer limited to Snowflake-only workflows.
For teams running hybrid architectures with some data in Snowflake and some in other systems, this is a meaningful upgrade. You can generate an Airflow DAG that pulls from a PostgreSQL source, transforms in Snowflake, and loads results back to an external system, all from one Cortex Code session.
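To make that hybrid shape concrete, here is a stubbed Python sketch of the three-stage flow. The function names, connection IDs, and target path are all hypothetical placeholders, not real Cortex Code output; in a generated DAG each stage would be an Airflow task.

```python
# Stubbed sketch of a hybrid extract-transform-load flow. All names below
# (connection IDs, warehouse, target path) are hypothetical placeholders.

def extract_from_postgres(conn_id):
    """Pull new rows from the PostgreSQL source (stubbed)."""
    return [{"payment_id": 1, "amount": "10.00"}]

def transform_in_snowflake(rows, warehouse):
    """Apply the Snowflake-side transformation (stubbed: cast amounts)."""
    return [{**row, "amount": round(float(row["amount"]), 2)} for row in rows]

def load_to_external(rows, target):
    """Write results to the external system (stubbed: return row count)."""
    return len(rows)

rows = extract_from_postgres("postgres_default")
transformed = transform_in_snowflake(rows, warehouse="transform_wh")
loaded = load_to_external(transformed, target="s3://exports/payments")
print(loaded)  # 1
```

The useful part is the ordering, not the stubs: one session can describe all three stages, and Cortex Code wires them into a single DAG.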
What Cortex Code Gets Right
- Metadata awareness eliminates guesswork. The biggest pain point with generic AI coding tools is that they hallucinate column names and table structures. Cortex Code reads the real schema, so the generated SQL actually references columns that exist.
- dbt conventions are baked in. Generated models follow dbt best practices: CTEs over subqueries, proper use of {{ source() }} and {{ ref() }}, and appropriate surrogate key generation.
- Security stays inside Snowflake. Your metadata does not leave Snowflake's infrastructure. For enterprises with strict data governance requirements, this is a big deal compared to sending code context to external AI services.
- Model flexibility. Being able to choose between Claude Opus 4.6 and GPT-5.2 means you are not stuck with one provider's strengths and weaknesses.
Where Cortex Code Still Needs Work
No tool is perfect, and Cortex Code is still early. Here are the rough edges we have encountered:
- Complex business logic still needs human review. Cortex Code handles structural transformations well (renames, casts, joins, filters) but struggles with nuanced business rules. If your revenue calculation has five edge cases, you will need to spell those out explicitly.
- Airflow DAG generation is template-level. The generated DAGs are solid scaffolds, but production DAGs usually need custom error handling, dynamic task generation, and integration with your specific alerting stack. Expect to modify 20 to 30 percent of the generated code.
- Limited IDE integration (for now). As of early March 2026, Cortex Code is primarily a CLI tool. There is no VS Code extension yet. If you live in your IDE, the workflow of switching to a CLI can feel disjointed.
Getting Started With Cortex Code
Here is how to set up Cortex Code on your Snowflake account. You will need ACCOUNTADMIN or a role with sufficient privileges.
Step 1: Enable Cortex Code
Cortex Code is available through the Snowflake CLI. Make sure your Snowflake CLI is updated to the latest version, then enable Cortex Code for your account:
```shell
snow cortex code enable --account your_account
```
Step 2: Configure Your Model Preference
Set your preferred AI model. You can change this anytime:
```shell
snow cortex code config --model claude-opus-4-6
# or
snow cortex code config --model gpt-5.2
```
Step 3: Point It at Your dbt Project
Navigate to your dbt project directory and let Cortex Code index your existing models:
```shell
cd /path/to/your/dbt/project
snow cortex code init --project-type dbt
```
Step 4: Generate Your First Model
Now you can ask Cortex Code to generate code. It will use your Snowflake metadata and your existing dbt project structure as context:
```shell
snow cortex code generate \
  "Create a staging model for raw.raw_customers with proper naming conventions"
```
When Should You Adopt Cortex Code?
If your team is already running Snowflake and dbt in production, Cortex Code is worth trying today. The setup takes less than 15 minutes, and even if you only use it for scaffolding new models, the time savings add up quickly across a team of data engineers.
If you are still evaluating Snowflake or have not adopted dbt yet, there is no rush. The tool is most valuable when you have an established data environment with meaningful metadata for it to read.
For teams considering a move to Snowflake, our Snowflake consulting practice can help you plan an implementation that takes advantage of Cortex Code from day one.
Key Takeaways
- Snowflake Cortex Code launched February 3, 2026 as an AI coding agent built specifically for data engineering workflows with dbt and Apache Airflow support.
- It differentiates from GitHub Copilot by reading your actual Snowflake metadata (table names, column types, relationships) to generate context-aware code.
- You can choose between Anthropic Claude Opus 4.6 and OpenAI GPT-5.2 as the underlying language model.
- The February 23, 2026 expansion added support for working with data sources beyond Snowflake.
- Best used for scaffolding dbt models, generating Airflow DAGs, and debugging SQL. Still needs human review for complex business logic.
- All processing stays within Snowflake's security perimeter, which makes it suitable for enterprises with strict data governance policies.
Frequently Asked Questions
Is Snowflake Cortex Code free to use?
Cortex Code consumes Snowflake credits when it runs. The exact cost depends on the underlying model you select and the complexity of the task. There is no separate subscription fee. Snowflake bills Cortex Code usage through your existing credit allocation.
Can I choose which AI model Cortex Code uses?
Yes. You can select between Anthropic Claude Opus 4.6 and OpenAI GPT-5.2 as the underlying language model. Switching between them is a single configuration change, so you can try both on the same task.
How is Cortex Code different from GitHub Copilot?
Copilot is a general purpose code assistant that works across any programming language. Cortex Code is purpose built for data engineering within Snowflake. It reads your Snowflake metadata, understands your schemas, column types, and relationships, and generates code that fits your actual data context. Copilot does not have access to your warehouse metadata.
Is my data safe when using Cortex Code?
Cortex Code operates within Snowflake's security perimeter. It reads metadata (table names, column names, data types) but does not access actual row level data. All processing happens within Snowflake's infrastructure. Standard RBAC policies apply to what Cortex Code can see.
Does Cortex Code work with tools beyond dbt and Airflow?
Yes. The February 23, 2026 expansion extended Cortex Code to support working with any data and any tool. While dbt and Apache Airflow were the initial focus, the expanded version can generate SQL, Python, and configuration files for a broader range of data engineering workflows.
