Azure SQL to Snowflake via Microsoft Fabric OneLake and ADLS Gen2 - Complete Pipeline Guide

Celestinfo Software Solutions Pvt. Ltd. Jan 22, 2026

Last updated: February 2026

Quick answer: To move data from Azure SQL to Snowflake via Microsoft Fabric, extract tables using a Fabric data pipeline into OneLake Lakehouse as Parquet files, create an ADLS Gen2 shortcut to the Lakehouse Files folder, configure a Snowflake external stage pointing to the ADLS Gen2 shortcut, then use a Snowflake stored procedure with INFER_SCHEMA and COPY INTO to dynamically create tables and load data.

Introduction

This Azure SQL to Snowflake pipeline guide demonstrates how to build an end-to-end automated data ingestion pipeline using Microsoft Fabric OneLake and ADLS Gen2. The pipeline extracts data from Azure SQL Database through Fabric data pipelines, stores it as Parquet in OneLake, and uses ADLS Gen2 shortcuts with Snowflake external stages for dynamic table creation and data loading.


Whether you are migrating data from Azure SQL to Snowflake or building an automated ELT pipeline, this Fabric-to-Snowflake approach provides a scalable, low-maintenance architecture that supports dynamic schema detection and hands-free data ingestion. For an alternative approach using Fabric mirroring, see our guide on Fabric mirroring with OneLake Snowflake catalog integration.


Architecture Overview


The data flows through the following stages in this Fabric data pipeline:


Azure SQL DB → Fabric Pipeline → OneLake Lakehouse Files → ADLS Gen2 Shortcut → Snowflake External Stage → Snowflake Dynamic Load


Tools and Technologies


Component           | Technology
--------------------|---------------------------------------------
Source Database     | Azure SQL Database
Pipeline / ETL Tool | Microsoft Fabric Data Pipeline
Lakehouse Storage   | Microsoft Fabric Lakehouse (OneLake)
External Storage    | Azure Data Lake Storage Gen2 (ADLS Gen2)
Data Warehouse      | Snowflake
Automation          | Snowflake Stored Procedure + Task Scheduler

Step 1 - Create Microsoft Fabric Workspace


Begin by setting up a dedicated workspace in Microsoft Fabric to organize all pipeline resources. Open the Microsoft Fabric portal, navigate to Workspaces, and create a new workspace (for example, "azuresnow"). If you do not already have a Fabric capacity assigned, enable the Fabric Trial capacity to get started. The workspace serves as the container for your Lakehouse, data pipelines, and any other Fabric artifacts used in this integration.


Step 2 - Create a Lakehouse in Fabric


Within your newly created workspace, click New item → Lakehouse and name it (for example, "azuresnow_lakehouse"). The Microsoft Fabric Lakehouse provides a unified storage layer built on OneLake, enabling you to store structured and semi-structured data in an open format. This Lakehouse will serve as the intermediate staging area where extracted Parquet files from Azure SQL Database are stored before they are accessed by Snowflake.


Step 3 - Load Azure SQL Data into OneLake


Create a new Data Pipeline in Microsoft Fabric and add a Copy Data activity. Configure the activity with the following settings:

  1. Source: the Azure SQL Database connection, selecting the tables (or a query) to extract.
  2. Destination: the Lakehouse created in Step 2, writing into the Files section.
  3. File format: Parquet.

This Fabric data pipeline extracts data from Azure SQL Database and writes it as Parquet files into the OneLake Lakehouse. The Parquet format is ideal for downstream analytics and is natively supported by Snowflake external stages, making it the preferred format for automated data ingestion workflows.


Issue - OneLake Direct Connection to Snowflake Failed


When attempting to connect Snowflake directly to Microsoft Fabric OneLake using the onelake.dfs.fabric.microsoft.com endpoint, Snowflake returns an IncorrectEndpointError (HTTP 400). This occurs because the OneLake endpoint does not fully support the Azure Blob Storage API operations that Snowflake's external stage commands, such as LIST and COPY INTO, depend on.


This is a known compatibility limitation. Snowflake external stages rely on standard Azure Blob Storage or ADLS Gen2 endpoints to enumerate files and perform bulk data loading. Since OneLake exposes a different API surface, a workaround is required to bridge the two platforms.
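For reference, a direct stage definition of this kind is what triggers the error; the workspace and lakehouse names below are placeholders, not values from this guide:

SQL - Direct OneLake Stage (fails)
-- Illustrative only: a stage pointed at the OneLake endpoint is rejected.
CREATE OR REPLACE STAGE ONELAKE_DIRECT_STAGE
URL = 'azure://onelake.dfs.fabric.microsoft.com/<workspace>/<lakehouse>.Lakehouse/Files/';

-- Listing this stage returns IncorrectEndpointError (HTTP 400)
LIST @ONELAKE_DIRECT_STAGE;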


Solution - ADLS Gen2 Shortcut Integration


The solution is to introduce an Azure Data Lake Storage Gen2 (ADLS Gen2) account as an intermediary storage layer. Follow these steps:


  1. Create an ADLS Gen2 Storage Account in Azure with Hierarchical Namespace enabled.
  2. Create a container named "fabricraw" (or your preferred name) within the storage account.
  3. In the Fabric Lakehouse, create a Shortcut that points to the ADLS Gen2 container URL using SAS Token authentication.

This ADLS Gen2 shortcut allows data written to the Fabric Lakehouse to be accessible via standard Azure Blob Storage endpoints, which Snowflake can connect to seamlessly. The shortcut acts as a bridge between Microsoft Fabric OneLake and Snowflake, resolving the IncorrectEndpointError without duplicating data.


Step 4 - Snowflake Integration Setup


With ADLS Gen2 configured, the next step is to set up the Snowflake storage integration, file format, and external stage. Execute the following SQL statements in your Snowflake worksheet.


4.1 - Create Database and Schema

SQL - Snowflake Database Setup
CREATE OR REPLACE DATABASE FABRIC_DB;
CREATE OR REPLACE SCHEMA FABRIC_DB.RAW;

4.2 - Create Parquet File Format

SQL - Parquet File Format
CREATE OR REPLACE FILE FORMAT FABRIC_PARQUET_FMT
TYPE = PARQUET;

4.3 - Create Storage Integration

Note: Snowflake requires the blob.core.windows.net endpoint, not the dfs endpoint. This is critical for the storage integration to work correctly with ADLS Gen2.

SQL - Storage Integration (Azure)
CREATE OR REPLACE STORAGE INTEGRATION FABRIC_ADLS_INT
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = AZURE
ENABLED = TRUE
AZURE_TENANT_ID = '<your-tenant-id>'
STORAGE_ALLOWED_LOCATIONS = ('azure://dlsaazuresnow.blob.core.windows.net/fabricraw/');

4.4 - Grant Azure Consent

After creating the storage integration, you must grant consent in Azure Active Directory. Run the following command to retrieve the consent URL:

SQL - Retrieve Consent URL
DESC STORAGE INTEGRATION FABRIC_ADLS_INT;
-- Copy the AZURE_CONSENT_URL value from the output
-- Open it in a browser and click Accept to grant consent

4.5 - Create External Stage

SQL - Snowflake External Stage
CREATE OR REPLACE STAGE FABRIC_ADLS_STAGE
URL = 'azure://dlsaazuresnow.blob.core.windows.net/fabricraw/'
STORAGE_INTEGRATION = FABRIC_ADLS_INT
FILE_FORMAT = FABRIC_PARQUET_FMT
DIRECTORY = (ENABLE = TRUE);

4.6 - Azure IAM Permission Configuration

Navigate to the ADLS Gen2 Storage Account in the Azure portal. Under Access Control (IAM), assign the "Storage Blob Data Contributor" role to the Snowflake Enterprise Application (the service principal created during the consent step). This grants Snowflake the necessary permissions to list and read Parquet files from the ADLS Gen2 container.
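Before moving on, it is worth verifying connectivity from the Snowflake side. Assuming the stage and file format names from Step 4, a quick listing confirms that Snowflake can enumerate the Parquet files:

SQL - Stage Connectivity Check
-- Confirm Snowflake can reach the container and see the files
LIST @FABRIC_DB.RAW.FABRIC_ADLS_STAGE;

-- After a refresh, the directory table should return the same paths
ALTER STAGE FABRIC_DB.RAW.FABRIC_ADLS_STAGE REFRESH;
SELECT RELATIVE_PATH, SIZE, LAST_MODIFIED
FROM DIRECTORY(@FABRIC_DB.RAW.FABRIC_ADLS_STAGE);

If LIST fails with an authorization error, re-check the IAM role assignment and the consent step from section 4.4.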


Step 5 - Dynamic Table Creation and Data Loading


The core automation in this pipeline is a Snowflake stored procedure called DYNAMIC_LOAD. This procedure scans the external stage directory for new Parquet files, dynamically creates tables using INFER_SCHEMA, loads data with COPY INTO, and tracks processed files to prevent duplicate loading.


SQL - Dynamic Load Stored Procedure
CREATE OR REPLACE PROCEDURE FABRIC_DB.RAW.DYNAMIC_LOAD()
RETURNS STRING
LANGUAGE SQL
EXECUTE AS CALLER
AS
$$
BEGIN
    -- Create file tracking table if it does not exist
    CREATE TABLE IF NOT EXISTS FABRIC_DB.RAW.FILE_TRACKING (
        FILE_NAME STRING,
        TABLE_NAME STRING,
        LOADED_AT TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
    );

    -- Refresh the external stage directory metadata
    ALTER STAGE FABRIC_DB.RAW.FABRIC_ADLS_STAGE REFRESH;

    -- Loop through all Parquet files in the stage directory
    LET cur CURSOR FOR
        SELECT RELATIVE_PATH
        FROM DIRECTORY(@FABRIC_DB.RAW.FABRIC_ADLS_STAGE)
        WHERE RELATIVE_PATH LIKE '%.parquet'
          AND RELATIVE_PATH NOT IN (SELECT FILE_NAME FROM FABRIC_DB.RAW.FILE_TRACKING);

    FOR rec IN cur DO
        LET file_path STRING := rec.RELATIVE_PATH;

        -- Extract table name from the file path (folder name)
        LET tbl_name STRING := UPPER(SPLIT_PART(:file_path, '/', 1));

        -- Check if the table already exists
        LET tbl_exists BOOLEAN := (
            SELECT COUNT(*) > 0
            FROM FABRIC_DB.INFORMATION_SCHEMA.TABLES
            WHERE TABLE_SCHEMA = 'RAW' AND TABLE_NAME = :tbl_name
        );

        -- Create table dynamically if it does not exist
        IF (NOT tbl_exists) THEN
            EXECUTE IMMEDIATE
                'CREATE TABLE FABRIC_DB.RAW.' || :tbl_name ||
                ' USING TEMPLATE (
                    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
                    FROM TABLE(
                        INFER_SCHEMA(
                            LOCATION => ''@FABRIC_DB.RAW.FABRIC_ADLS_STAGE/' || :file_path || ''',
                            FILE_FORMAT => ''FABRIC_DB.RAW.FABRIC_PARQUET_FMT''
                        )
                    )
                )';
        END IF;

        -- Load data from the Parquet file into the target table
        EXECUTE IMMEDIATE
            'COPY INTO FABRIC_DB.RAW.' || :tbl_name ||
            ' FROM @FABRIC_DB.RAW.FABRIC_ADLS_STAGE/' || :file_path ||
            ' FILE_FORMAT = (FORMAT_NAME = FABRIC_DB.RAW.FABRIC_PARQUET_FMT)' ||
            ' MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE';

        -- Track the loaded file
        INSERT INTO FABRIC_DB.RAW.FILE_TRACKING (FILE_NAME, TABLE_NAME)
        VALUES (:file_path, :tbl_name);

    END FOR;

    RETURN 'DYNAMIC_LOAD completed successfully.';
END;
$$;
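
To test the procedure manually before scheduling it, call it directly from a worksheet:

SQL - Manual Invocation
CALL FABRIC_DB.RAW.DYNAMIC_LOAD();
-- Returns 'DYNAMIC_LOAD completed successfully.' when the run finishes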

How the Stored Procedure Works


  1. Creates the FILE_TRACKING table if it does not already exist, so every loaded file is recorded.
  2. Refreshes the stage directory metadata so newly arrived files become visible.
  3. Iterates over the Parquet files in the stage that are not yet listed in FILE_TRACKING.
  4. Derives the target table name from the top-level folder in each file path.
  5. If the table does not exist, creates it dynamically with INFER_SCHEMA and CREATE TABLE ... USING TEMPLATE.
  6. Loads the file with COPY INTO using MATCH_BY_COLUMN_NAME, then records the file in FILE_TRACKING.

Step 6 - Automate with Snowflake Tasks


To fully automate the data ingestion pipeline, create a Snowflake Task that invokes the DYNAMIC_LOAD stored procedure on a recurring schedule. This enables hands-free, scheduled data loading from ADLS Gen2 into Snowflake.


SQL - Snowflake Task Scheduler
CREATE OR REPLACE TASK FABRIC_AUTO_TASK
WAREHOUSE = COMPUTE_WH
SCHEDULE = '5 MINUTE'
AS
CALL FABRIC_DB.RAW.DYNAMIC_LOAD();

ALTER TASK FABRIC_AUTO_TASK RESUME;

The task runs every 5 minutes, automatically detecting and loading any new Parquet files that arrive in the ADLS Gen2 container. You can adjust the schedule interval based on your data freshness requirements. Use ALTER TASK FABRIC_AUTO_TASK SUSPEND; to pause the automated ingestion when needed.
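
If you prefer fixed run times instead of a fixed interval, Snowflake tasks also accept a CRON expression. As a sketch, the task could be switched to an hourly schedule (the task must be suspended before its schedule is altered):

SQL - CRON Schedule Alternative
ALTER TASK FABRIC_AUTO_TASK SUSPEND;

-- Run at the top of every hour, UTC
ALTER TASK FABRIC_AUTO_TASK
SET SCHEDULE = 'USING CRON 0 * * * * UTC';

ALTER TASK FABRIC_AUTO_TASK RESUME;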


Data Validation


After the pipeline executes, run the following validation queries in Snowflake to confirm that data has been loaded correctly and that all files have been tracked.


SQL - Validation Queries
-- Verify tables were created in the RAW schema
SHOW TABLES IN SCHEMA FABRIC_DB.RAW;

-- Check row counts for loaded tables
SELECT TABLE_NAME, ROW_COUNT
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'RAW' AND TABLE_CATALOG = 'FABRIC_DB';

-- Review file tracking log
SELECT * FROM FABRIC_DB.RAW.FILE_TRACKING ORDER BY LOADED_AT DESC;

-- Sample data from a loaded table
SELECT * FROM FABRIC_DB.RAW.<TABLE_NAME> LIMIT 10;

-- Verify task execution history
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(TASK_NAME => 'FABRIC_AUTO_TASK'))
ORDER BY SCHEDULED_TIME DESC
LIMIT 10;

Conclusion


This pipeline provides a fully automated, end-to-end solution for moving data from Azure SQL Database through Microsoft Fabric OneLake and ADLS Gen2 into Snowflake. By combining the data extraction capabilities of Fabric data pipelines, the storage flexibility of ADLS Gen2 shortcuts, and the dynamic table creation power of Snowflake stored procedures, organizations can build scalable, low-maintenance data ingestion workflows. Once your data lands in Snowflake, use dbt for ELT transformations and ensure proper governance, security, and cost controls.

Chakri, Cloud Solutions Architect

Chakri is a Cloud Solutions Architect at CelestInfo with hands-on experience across AWS, Azure, GCP, and Snowflake cloud infrastructure.


Frequently Asked Questions

Q: Why can't Snowflake connect directly to Microsoft Fabric OneLake?

Snowflake's external stage operations (LIST, COPY INTO) require endpoints that support Azure Blob Storage or ADLS Gen2 APIs. OneLake's dfs.fabric.microsoft.com endpoint does not fully support these operations, resulting in an IncorrectEndpointError. The solution is to use an ADLS Gen2 shortcut as an intermediary.

Q: Why does Snowflake use blob.core.windows.net instead of dfs.core.windows.net?

Snowflake's Azure storage integration works with the Azure Blob Storage endpoint (blob.core.windows.net). While ADLS Gen2 exposes both blob and dfs endpoints, Snowflake requires the blob endpoint for external stage operations.

Q: How does the dynamic table creation stored procedure work?

The procedure scans the external stage for new Parquet files, uses INFER_SCHEMA to detect column definitions, creates tables dynamically using USING TEMPLATE, and loads data with COPY INTO using MATCH_BY_COLUMN_NAME for flexible schema mapping.

Q: Can this pipeline handle schema changes in source data?

Partially. Tables created after a schema change pick up the full source schema automatically via INFER_SCHEMA. For tables that already exist, COPY INTO with MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE loads the columns that match but does not add new ones; to have Snowflake add newly appearing columns on load, enable schema evolution on the target table.
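
As a sketch, schema evolution can be enabled per table so that subsequent COPY INTO runs with MATCH_BY_COLUMN_NAME add any new columns found in arriving files (the table name below is illustrative):

SQL - Enable Schema Evolution
-- Illustrative table name: allow COPY INTO to add new columns
ALTER TABLE FABRIC_DB.RAW.CUSTOMERS SET ENABLE_SCHEMA_EVOLUTION = TRUE;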
