Snowflake Cortex Code: The AI Coding Agent for dbt and Airflow Teams

Celestinfo Software Solutions Pvt. Ltd. Mar 04, 2026

Quick answer: Snowflake Cortex Code is an AI coding agent unveiled on February 3, 2026 that generates dbt models, Airflow DAGs, and SQL transformations using your actual Snowflake metadata as context. Unlike general purpose tools like GitHub Copilot, Cortex Code reads your schemas, understands column types and relationships, and produces code that fits your real data environment. You can choose between Anthropic Claude Opus 4.6 and OpenAI GPT-5.2 as the underlying model. On February 23, 2026, Snowflake expanded Cortex Code to support any data, anywhere.

Last updated: March 2026

What Is Cortex Code and Why Should You Care?

If you spend your days writing dbt models, debugging Airflow DAGs, or wrangling SQL transformations, you have probably wished for an assistant that actually understands your data. Not just SQL syntax. Your actual tables, your column names, your business logic.

That is exactly what Snowflake built with Cortex Code. Unveiled on February 3, 2026, Cortex Code is an AI coding agent designed specifically for data engineering teams. It plugs into your Snowflake environment, reads your metadata, and generates code that is contextually aware of your enterprise data landscape.

This is not another chatbot that writes generic SQL snippets. Cortex Code knows that your orders table has a status column with values like 'completed', 'pending', and 'cancelled'. It knows that customer_id in your orders table maps to id in your customers table. It uses all of that context when generating code for you.

How Cortex Code Actually Works

The magic is in the metadata layer. When you connect Cortex Code to your Snowflake account, it does something that general purpose coding assistants simply cannot do. It reads your Snowflake information schema, including table definitions, column data types, primary and foreign key relationships, access policies, and even table comments.
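Cortex Code's internals are not public, but the metadata it reads lives in the same information schema you can query yourself. Here is a hypothetical sketch of the kind of lookup involved; the function name and identifiers are ours, purely for illustration (a real implementation would also bind parameters rather than interpolate strings):

```python
# Hypothetical sketch of an INFORMATION_SCHEMA lookup a metadata-aware agent
# could run. Cortex Code's actual implementation is not public; this only
# shows where the schema context comes from. Identifiers are illustrative.

def column_metadata_query(database: str, schema: str, table: str) -> str:
    """Return a query for one table's column names, types, and comments."""
    return (
        f"SELECT column_name, data_type, is_nullable, comment "
        f"FROM {database}.INFORMATION_SCHEMA.COLUMNS "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}' "
        f"ORDER BY ordinal_position"
    )

print(column_metadata_query("ANALYTICS", "RAW", "RAW_PAYMENTS"))
```

Running that query against your own account shows exactly the table definitions, data types, and comments described above.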

Here is what happens under the hood when you ask Cortex Code to generate a dbt model:

  1. Metadata retrieval: Cortex Code queries your Snowflake metadata to understand the source tables, their columns, data types, and relationships.
  2. Context assembly: It builds a contextual prompt that includes your schema information, any existing dbt models in your project, and your specific request.
  3. Code generation: The underlying language model (your choice of Claude Opus 4.6 or GPT-5.2) generates the code using all of that context.
  4. Validation: The generated code is checked against your schema to catch obvious mismatches like referencing columns that do not exist.

The result is code that you can actually run. Not placeholder code with your_table_here comments scattered throughout.
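To make steps 2 and 4 concrete, here is a minimal sketch of context assembly and schema validation using a plain dict in place of live Snowflake metadata. This is our own illustration of the idea, not Cortex Code's implementation; the identifier-matching here is deliberately naive:

```python
# Minimal sketch of steps 2 and 4 (context assembly, schema validation).
# The schema dict stands in for live Snowflake metadata; this is an
# illustration of the approach, not Cortex Code's actual code.
import re

schema = {
    "raw_payments": ["PaymentID", "OrderID", "Amount", "Status", "CreatedAt"],
}

def build_context(request: str, schema: dict) -> str:
    """Assemble a prompt pairing the user's request with real column names."""
    lines = [f"Table {t}: columns {', '.join(cols)}" for t, cols in schema.items()]
    return "\n".join(lines) + "\n\nRequest: " + request

def unknown_columns(sql: str, table: str, schema: dict) -> set:
    """Flag identifiers in generated SQL that are not columns of `table`."""
    known = {c.lower() for c in schema[table]}
    idents = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql))
    keywords = {"select", "from", "where", "as", "and", "cast", "decimal", table}
    return {i for i in idents if i.lower() not in known | keywords}

print(unknown_columns("SELECT PaymentID, Amout FROM raw_payments",
                      "raw_payments", schema))  # flags the typo 'Amout'
```

Even this toy validator catches the most common failure mode of generic assistants: confidently referencing a column that does not exist.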

Choosing Your Model: Claude Opus 4.6 vs GPT-5.2

One of the more interesting design decisions Snowflake made is letting you pick the underlying AI model. You are not locked into a single provider. The two options available at launch are Anthropic Claude Opus 4.6 and OpenAI GPT-5.2.

In practice, the choice comes down to what you are generating. Our team has been experimenting with both, and so far neither is definitively better: each has produced the stronger output on some tasks and the weaker one on others. The good news is that switching between them takes a single configuration change, so you can try both on the same task and keep whichever output you prefer.

Practical Use Case: Generating dbt Models From Raw Tables

Let us walk through a real scenario. You have a raw table called raw_payments that just landed in your Snowflake staging schema through your ELT pipeline. You need to build a staging model and a mart model on top of it.

With Cortex Code, you describe what you want in plain English:

Cortex Code prompt
Create a dbt staging model for raw_payments that:
- Renames columns to snake_case
- Casts amount to decimal(18,2)
- Filters out records where status = 'test'
- Adds a surrogate key using payment_id and order_id

Cortex Code reads the actual raw_payments table definition from Snowflake, sees the real column names (maybe they are PaymentID, OrderID, Amount, Status, CreatedAt), and generates something like this:

SQL (dbt model) generated by Cortex Code
WITH source AS (
    SELECT * FROM {{ source('raw', 'raw_payments') }}
),

renamed AS (
    SELECT
        {{ dbt_utils.generate_surrogate_key(['PaymentID', 'OrderID']) }} AS payment_key,
        PaymentID AS payment_id,
        OrderID AS order_id,
        CAST(Amount AS DECIMAL(18, 2)) AS amount,
        Status AS status,
        CreatedAt AS created_at
    FROM source
    WHERE Status != 'test'
)

SELECT * FROM renamed

Notice that the column names match the actual source table. It did not guess. It read the metadata and used the real column names. That is the difference between context-aware code generation and a generic autocomplete.
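For reference, dbt_utils.generate_surrogate_key compiles down to an MD5 hash of the key columns, cast to strings, null-safe, and joined by a separator. A Python equivalent of that hashing scheme (the separator and null placeholder match dbt_utils defaults at the time of writing; verify them against your installed version):

```python
# Python equivalent of what dbt_utils.generate_surrogate_key compiles to:
# md5 over the key columns, cast to strings, null-safe, joined by '-'.
# Separator and null placeholder follow dbt_utils defaults; check your
# installed dbt_utils version before relying on exact hash values.
import hashlib

NULL_PLACEHOLDER = "_dbt_utils_surrogate_key_null_"

def surrogate_key(*values) -> str:
    parts = [NULL_PLACEHOLDER if v is None else str(v) for v in values]
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()

# Same inputs always yield the same key, so rebuilding the model is idempotent.
print(surrogate_key(1001, 555))
```

That determinism is why a surrogate key built from payment_id and order_id is safe to regenerate on every dbt run.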

Practical Use Case: Creating Airflow DAGs

The second major workflow Cortex Code supports is Apache Airflow. Say you need a DAG that orchestrates your daily dbt run, waits for a source table to be refreshed, and sends a Slack notification on failure.

Cortex Code generates the DAG scaffold with the correct Snowflake operator imports, your connection ID, and sensible defaults for retries and timeouts. It even structures the task dependencies correctly based on your description.

Python (Airflow DAG) generated by Cortex Code
from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator
from airflow.providers.slack.notifications.slack import send_slack_notification
from datetime import datetime, timedelta

default_args = {
    'owner': 'data_team',
    'retries': 2,
    'retry_delay': timedelta(minutes=5),
    'on_failure_callback': send_slack_notification(
        slack_conn_id='slack_default',
        text='DAG {{ dag.dag_id }} failed on {{ ds }}'
    ),
}

with DAG(
    dag_id='daily_payments_pipeline',
    default_args=default_args,
    schedule='@daily',
    start_date=datetime(2026, 1, 1),
    catchup=False,
    tags=['payments', 'dbt'],
) as dag:

    check_source = SnowflakeOperator(
        task_id='check_source_freshness',
        snowflake_conn_id='snowflake_default',
        sql="""SELECT COUNT(*) FROM raw.raw_payments
               WHERE created_at >= DATEADD('hour', -24, CURRENT_TIMESTAMP())""",
    )

    run_dbt = DbtCloudRunJobOperator(
        task_id='run_dbt_models',
        dbt_cloud_conn_id='dbt_cloud_default',
        job_id=12345,
        wait_for_termination=True,
    )

    check_source >> run_dbt

Is this production-ready as is? Probably not. You will want to adjust the job ID, fine-tune the retry logic, and add your own error handling. But as a starting point, it saves you 30 minutes of boilerplate work and gets the structure right from the start.
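One hardening step worth spelling out: the freshness task in the generated DAG only counts rows, so it never fails when nothing fresh arrived. Here is a minimal sketch of the check as plain Python; the row-count fetch is a stub of our own invention, and in a real DAG you would wire this up via a PythonOperator or swap in a SQL check operator:

```python
# Illustrative hardening of the freshness check: fail when no rows landed
# in the last 24 hours. fetch_fresh_row_count is a stub standing in for a
# real Snowflake query; these names are ours, not Cortex Code output.

def fetch_fresh_row_count() -> int:
    # Stub: replace with a real query against raw.raw_payments.
    return 0

def assert_source_fresh(count: int, minimum: int = 1) -> None:
    """Raise if fewer than `minimum` fresh rows arrived."""
    if count < minimum:
        raise ValueError(f"Source not fresh: {count} rows, expected >= {minimum}")

try:
    assert_source_fresh(fetch_fresh_row_count())
except ValueError as e:
    print(e)  # in Airflow, the raised error would fail the task and trigger retries
```

Raising an exception is what actually fails an Airflow task, which is the behavior you want when the upstream load silently produced nothing.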

Cortex Code vs GitHub Copilot: A Fair Comparison

The obvious question is: why not just use GitHub Copilot? It is a great tool. We use it daily for general software engineering work. But for data engineering specifically, the two tools solve different problems.

| Feature | Cortex Code | GitHub Copilot |
| --- | --- | --- |
| Data context | Reads Snowflake metadata (schemas, columns, types, relationships) | Uses open files in your editor as context |
| Best for | dbt models, Airflow DAGs, SQL transformations, Snowflake-specific code | General programming across all languages |
| Model choice | Claude Opus 4.6 or GPT-5.2 | GPT-4o and other OpenAI models |
| Security | Runs within Snowflake security perimeter, RBAC enforced | Code sent to GitHub/OpenAI servers |
| Column awareness | Knows actual column names and types from your warehouse | Guesses based on code patterns in open files |
The short version: use Copilot for writing Python functions, React components, and general code. Use Cortex Code for anything that touches your Snowflake data layer.

The February 23 Expansion: Any Data, Anywhere

Three weeks after the initial launch, Snowflake expanded Cortex Code on February 23, 2026 (announced via BusinessWire) to support working with data beyond Snowflake's own environment. The tagline was "any data, anywhere."

What this means in practice: Cortex Code can now generate transformation logic for data sources outside of Snowflake, including external tables, data shares, and even code that interacts with non-Snowflake databases. The metadata awareness still works best within Snowflake (where it has direct access to the information schema), but the code generation capabilities are no longer limited to Snowflake-only workflows.

For teams running hybrid architectures with some data in Snowflake and some in other systems, this is a meaningful upgrade. You can generate an Airflow DAG that pulls from a PostgreSQL source, transforms in Snowflake, and loads results back to an external system, all from one Cortex Code session.

What Cortex Code Gets Right

Based on our experimentation so far, the strengths line up with what the rest of this article describes:

  - Column-level accuracy: generated code uses the real column names and types read from your metadata, not guesses.
  - Boilerplate elimination: dbt staging models and Airflow DAG scaffolds come out structurally correct from the start.
  - Security posture: it operates within Snowflake's security perimeter, reads metadata rather than row-level data, and respects RBAC.
  - Model flexibility: switching between Claude Opus 4.6 and GPT-5.2 is a single configuration change.

Where Cortex Code Still Needs Work

No tool is perfect, and Cortex Code is still early. Here are the rough edges we have encountered:

  - Generated code is a starting point, not a finished product: job IDs, retry logic, and error handling still need your attention before production.
  - Metadata awareness is strongest inside Snowflake; for the external sources added in the February 23 expansion, review the output more carefully.

Getting Started With Cortex Code

Here is how to set up Cortex Code on your Snowflake account. You will need ACCOUNTADMIN or a role with sufficient privileges.

Step 1: Enable Cortex Code

Cortex Code is available through the Snowflake CLI. Make sure your Snowflake CLI is updated to the latest version, then enable Cortex Code for your account:

Shell
snow cortex code enable --account your_account

Step 2: Configure Your Model Preference

Set your preferred AI model. You can change this anytime:

Shell
snow cortex code config --model claude-opus-4-6
# or
snow cortex code config --model gpt-5.2

Step 3: Point It at Your dbt Project

Navigate to your dbt project directory and let Cortex Code index your existing models:

Shell
cd /path/to/your/dbt/project
snow cortex code init --project-type dbt

Step 4: Generate Your First Model

Now you can ask Cortex Code to generate code. It will use your Snowflake metadata and your existing dbt project structure as context:

Shell
snow cortex code generate \
  "Create a staging model for raw.raw_customers with proper naming conventions"

When Should You Adopt Cortex Code?

If your team is already running Snowflake and dbt in production, Cortex Code is worth trying today. The setup takes less than 15 minutes, and even if you only use it for scaffolding new models, the time savings add up quickly across a team of data engineers.

If you are still evaluating Snowflake or have not adopted dbt yet, there is no rush. The tool is most valuable when you have an established data environment with meaningful metadata for it to read.

For teams considering a move to Snowflake, our Snowflake consulting practice can help you plan an implementation that takes advantage of Cortex Code from day one.


Key Takeaways

  - Cortex Code, unveiled February 3, 2026, generates dbt models, Airflow DAGs, and SQL transformations using your actual Snowflake metadata as context.
  - You pick the underlying model, Claude Opus 4.6 or GPT-5.2, and switching is a single configuration change.
  - The February 23, 2026 expansion extended Cortex Code beyond Snowflake-only workflows to any data, anywhere.
  - Generated code is a strong starting point, not a finished product; production hardening is still your job.
  - If you already run Snowflake and dbt in production, setup takes under 15 minutes and is worth trying today.


Mohan, Senior Data Engineer

Mohan specializes in Databricks, Spark optimization, and large-scale data platform engineering at CelestInfo. He architects data solutions using dbt, Airflow, and Snowflake for enterprise clients.

Frequently Asked Questions

Is Snowflake Cortex Code free to use?

Cortex Code consumes Snowflake credits when it runs. The exact cost depends on the underlying model you select and the complexity of the task. There is no separate subscription fee. Snowflake bills Cortex Code usage through your existing credit allocation.

Can I choose which AI model Cortex Code uses?

Yes. You can select between Anthropic Claude Opus 4.6 and OpenAI GPT-5.2 as the underlying language model. Switching between them is a single configuration change, so you can try both on the same task.

How is Cortex Code different from GitHub Copilot?

Copilot is a general purpose code assistant that works across any programming language. Cortex Code is purpose built for data engineering within Snowflake. It reads your Snowflake metadata, understands your schemas, column types, and relationships, and generates code that fits your actual data context. Copilot does not have access to your warehouse metadata.

Is my data safe when using Cortex Code?

Cortex Code operates within Snowflake's security perimeter. It reads metadata (table names, column names, data types) but does not access actual row level data. All processing happens within Snowflake's infrastructure. Standard RBAC policies apply to what Cortex Code can see.

Does Cortex Code work with tools beyond dbt and Airflow?

Yes. The February 23, 2026 expansion extended Cortex Code to support working with any data and any tool. While dbt and Apache Airflow were the initial focus, the expanded version can generate SQL, Python, and configuration files for a broader range of data engineering workflows.
