Azure SQL to Snowflake via Microsoft Fabric OneLake and ADLS Gen2 - Complete Pipeline Guide
Last updated: February 2026
Quick answer: To move data from Azure SQL to Snowflake via Microsoft Fabric, extract tables using a Fabric data pipeline into OneLake Lakehouse as Parquet files, create an ADLS Gen2 shortcut to the Lakehouse Files folder, configure a Snowflake external stage pointing to the ADLS Gen2 shortcut, then use a Snowflake stored procedure with INFER_SCHEMA and COPY INTO to dynamically create tables and load data.
Introduction
This Azure SQL to Snowflake pipeline guide demonstrates how to build an end-to-end automated data ingestion pipeline using Microsoft Fabric OneLake and ADLS Gen2. The pipeline extracts data from Azure SQL Database through Fabric data pipelines, stores it as Parquet in OneLake, and uses ADLS Gen2 shortcuts with Snowflake external stages for dynamic table creation and data loading.
Whether you are migrating data from Azure SQL to Snowflake or building an automated ELT pipeline, this Fabric-to-Snowflake approach provides a scalable, low-maintenance architecture that supports dynamic schema detection and hands-free data ingestion. For an alternative approach using Fabric mirroring, see our guide on Fabric mirroring with OneLake Snowflake catalog integration.
Architecture Overview
The data flows through the following stages in this Fabric data pipeline:
Azure SQL DB → Fabric Pipeline → OneLake Lakehouse Files → ADLS Gen2 Shortcut → Snowflake External Stage → Snowflake Dynamic Load
Tools and Technologies
| Component | Technology |
|---|---|
| Source Database | Azure SQL Database |
| Pipeline / ETL Tool | Microsoft Fabric Data Pipeline |
| Lakehouse Storage | Microsoft Fabric Lakehouse (OneLake) |
| External Storage | Azure Data Lake Storage Gen2 (ADLS Gen2) |
| Data Warehouse | Snowflake |
| Automation | Snowflake Stored Procedure + Task Scheduler |
Step 1 - Create Microsoft Fabric Workspace
Begin by setting up a dedicated workspace in Microsoft Fabric to organize all pipeline resources. Open the Microsoft Fabric portal, navigate to Workspaces, and create a new workspace (for example, "azuresnow"). If you do not already have a Fabric capacity assigned, enable the Fabric Trial capacity to get started. The workspace serves as the container for your Lakehouse, data pipelines, and any other Fabric artifacts used in this integration.
Step 2 - Create a Lakehouse in Fabric
Within your newly created workspace, click New item → Lakehouse and name it (for example, "azuresnow_lakehouse"). The Microsoft Fabric Lakehouse provides a unified storage layer built on OneLake, enabling you to store structured and semi-structured data in an open format. This Lakehouse will serve as the intermediate staging area where extracted Parquet files from Azure SQL Database are stored before they are accessed by Snowflake.
Step 3 - Load Azure SQL Data into OneLake
Create a new Data Pipeline in Microsoft Fabric and add a Copy Data activity. Configure the activity with the following settings:
- Source: Azure SQL Database (provide server name, database name, and authentication credentials)
- Destination: Fabric Lakehouse Files section
- Output Format: Parquet
- File Path: Files/raw/<table_name>/
This Fabric data pipeline extracts data from Azure SQL Database and writes it as Parquet files into the OneLake Lakehouse. The Parquet format is ideal for downstream analytics and is natively supported by Snowflake external stages, making it the preferred format for automated data ingestion workflows.
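As a quick sanity check after the pipeline runs, you can confirm that the files landing in the Lakehouse really are Parquet by inspecting their framing: every valid Parquet file begins and ends with the four magic bytes `PAR1`. A minimal sketch (not part of the pipeline itself, just a local validation aid):

```python
def looks_like_parquet(data: bytes) -> bool:
    """Cheap structural sanity check: every valid Parquet file
    begins and ends with the 4-byte magic marker b"PAR1".
    This checks only the framing, not the footer metadata."""
    return len(data) >= 8 and data[:4] == b"PAR1" and data[-4:] == b"PAR1"
```

For a downloaded file you would call it as `looks_like_parquet(pathlib.Path("part-0001.parquet").read_bytes())`; the file name here is a placeholder.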
Issue - OneLake Direct Connection to Snowflake Failed
When attempting to connect Snowflake directly to Microsoft Fabric OneLake using the onelake.dfs.fabric.microsoft.com endpoint, Snowflake returns an IncorrectEndpointError (HTTP 400). This occurs because the OneLake endpoint does not fully support the Azure Blob Storage API operations that Snowflake's external stage functionality depends on, so commands such as LIST and COPY INTO fail against it.
This is a known compatibility limitation. Snowflake external stages rely on standard Azure Blob Storage or ADLS Gen2 endpoints to enumerate files and perform bulk data loading. Since OneLake exposes a different API surface, a workaround is required to bridge the two platforms.
Solution - ADLS Gen2 Shortcut Integration
The solution is to introduce an Azure Data Lake Storage Gen2 (ADLS Gen2) account as an intermediary storage layer. Follow these steps:
- Create an ADLS Gen2 Storage Account in Azure with Hierarchical Namespace enabled.
- Create a container named "fabricraw" (or your preferred name) within the storage account.
- In the Fabric Lakehouse, create a Shortcut that points to the ADLS Gen2 container URL using SAS Token authentication.
This ADLS Gen2 shortcut allows data written to the Fabric Lakehouse to be accessible via standard Azure Blob Storage endpoints, which Snowflake can connect to seamlessly. The shortcut acts as a bridge between Microsoft Fabric OneLake and Snowflake, resolving the IncorrectEndpointError without duplicating data.
Step 4 - Snowflake Integration Setup
With ADLS Gen2 configured, the next step is to set up the Snowflake storage integration, file format, and external stage. Execute the following SQL statements in your Snowflake worksheet.
4.1 - Create Database and Schema
CREATE OR REPLACE DATABASE FABRIC_DB;
CREATE OR REPLACE SCHEMA FABRIC_DB.RAW;
4.2 - Create Parquet File Format
CREATE OR REPLACE FILE FORMAT FABRIC_PARQUET_FMT
TYPE = PARQUET;
4.3 - Create Storage Integration
Note: Snowflake requires the blob.core.windows.net endpoint, not the dfs endpoint. This is critical for the storage integration to work correctly with ADLS Gen2.
CREATE OR REPLACE STORAGE INTEGRATION FABRIC_ADLS_INT
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = AZURE
ENABLED = TRUE
AZURE_TENANT_ID = '<your-tenant-id>'
STORAGE_ALLOWED_LOCATIONS = ('azure://dlsaazuresnow.blob.core.windows.net/fabricraw/');
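Because the integration requires the blob endpoint, a common slip is to copy a dfs-style URL from the Azure portal into STORAGE_ALLOWED_LOCATIONS. A small illustrative Python helper (not part of the pipeline) that normalizes an https dfs URL into the azure:// blob form Snowflake expects:

```python
def to_snowflake_stage_url(adls_url: str) -> str:
    """Normalize an ADLS Gen2 https:// URL to the azure:// form
    with the blob.core.windows.net endpoint that Snowflake
    storage integrations and external stages require."""
    return (
        adls_url
        .replace("https://", "azure://")
        .replace(".dfs.core.windows.net", ".blob.core.windows.net")
    )
```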
4.4 - Grant Azure Consent
After creating the storage integration, you must grant consent in Microsoft Entra ID (formerly Azure Active Directory). Run the following command to retrieve the consent URL:
DESC STORAGE INTEGRATION FABRIC_ADLS_INT;
-- Copy the AZURE_CONSENT_URL value from the output
-- Open it in a browser and click Accept to grant consent
4.5 - Create External Stage
CREATE OR REPLACE STAGE FABRIC_ADLS_STAGE
URL = 'azure://dlsaazuresnow.blob.core.windows.net/fabricraw/'
STORAGE_INTEGRATION = FABRIC_ADLS_INT
FILE_FORMAT = FABRIC_PARQUET_FMT
DIRECTORY = (ENABLE = TRUE);
4.6 - Azure IAM Permission Configuration
Navigate to the ADLS Gen2 Storage Account in the Azure portal. Under Access Control (IAM), assign the "Storage Blob Data Contributor" role to the Snowflake Enterprise Application (the service principal created during the consent step). This grants Snowflake the necessary permissions to list and read Parquet files from the ADLS Gen2 container.
Step 5 - Dynamic Table Creation and Data Loading
The core automation in this pipeline is a Snowflake stored procedure called DYNAMIC_LOAD. This procedure scans the external stage directory for new Parquet files, dynamically creates tables using INFER_SCHEMA, loads data with COPY INTO, and tracks processed files to prevent duplicate loading.
CREATE OR REPLACE PROCEDURE FABRIC_DB.RAW.DYNAMIC_LOAD()
RETURNS STRING
LANGUAGE SQL
EXECUTE AS CALLER
AS
$$
BEGIN
-- Create file tracking table if it does not exist
CREATE TABLE IF NOT EXISTS FABRIC_DB.RAW.FILE_TRACKING (
FILE_NAME STRING,
TABLE_NAME STRING,
LOADED_AT TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
);
-- Refresh the external stage directory metadata
ALTER STAGE FABRIC_DB.RAW.FABRIC_ADLS_STAGE REFRESH;
-- Loop through all Parquet files in the stage directory
LET cur CURSOR FOR
SELECT RELATIVE_PATH
FROM DIRECTORY(@FABRIC_DB.RAW.FABRIC_ADLS_STAGE)
WHERE RELATIVE_PATH LIKE '%.parquet'
AND RELATIVE_PATH NOT IN (SELECT FILE_NAME FROM FABRIC_DB.RAW.FILE_TRACKING);
FOR rec IN cur DO
LET file_path STRING := rec.RELATIVE_PATH;
-- Extract table name from the first folder of the relative path.
-- Adjust the SPLIT_PART index to match your folder depth: with files
-- landing under raw/<table_name>/, the table name is part 2, not 1.
LET tbl_name STRING := UPPER(SPLIT_PART(:file_path, '/', 1));
-- Check if the table already exists (qualify INFORMATION_SCHEMA with
-- the database so the check works regardless of session context)
LET tbl_exists BOOLEAN := (
SELECT COUNT(*) > 0
FROM FABRIC_DB.INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'RAW' AND TABLE_NAME = :tbl_name
);
-- Create table dynamically if it does not exist
IF (NOT tbl_exists) THEN
EXECUTE IMMEDIATE
'CREATE TABLE FABRIC_DB.RAW.' || :tbl_name ||
' USING TEMPLATE (
SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
FROM TABLE(
INFER_SCHEMA(
LOCATION => ''@FABRIC_DB.RAW.FABRIC_ADLS_STAGE/' || :file_path || ''',
FILE_FORMAT => ''FABRIC_DB.RAW.FABRIC_PARQUET_FMT''
)
)
)';
END IF;
-- Load data from the Parquet file into the target table
EXECUTE IMMEDIATE
'COPY INTO FABRIC_DB.RAW.' || :tbl_name ||
' FROM @FABRIC_DB.RAW.FABRIC_ADLS_STAGE/' || :file_path ||
' FILE_FORMAT = (FORMAT_NAME = FABRIC_DB.RAW.FABRIC_PARQUET_FMT)' ||
' MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE';
-- Track the loaded file
INSERT INTO FABRIC_DB.RAW.FILE_TRACKING (FILE_NAME, TABLE_NAME)
VALUES (:file_path, :tbl_name);
END FOR;
RETURN 'DYNAMIC_LOAD completed successfully.';
END;
$$;
How the Stored Procedure Works
- Stage Refresh: Refreshes the external stage directory metadata to detect newly arrived Parquet files.
- File Discovery: Queries the stage directory and filters for .parquet files not yet tracked in the FILE_TRACKING table.
- Dynamic Table Creation: Uses INFER_SCHEMA to detect column names and types from the Parquet file metadata, then creates the table using USING TEMPLATE.
- Data Loading: Executes COPY INTO with MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE for flexible schema mapping between Parquet files and Snowflake tables.
- File Tracking: Records each processed file in the FILE_TRACKING table to prevent duplicate loads on subsequent runs.
Step 6 - Automate with Snowflake Tasks
To fully automate the data ingestion pipeline, create a Snowflake Task that invokes the DYNAMIC_LOAD stored procedure on a recurring schedule. This enables hands-free, scheduled data loading from ADLS Gen2 into Snowflake.
CREATE OR REPLACE TASK FABRIC_AUTO_TASK
WAREHOUSE = COMPUTE_WH
SCHEDULE = '5 MINUTE'
AS
CALL FABRIC_DB.RAW.DYNAMIC_LOAD();
ALTER TASK FABRIC_AUTO_TASK RESUME;
The task runs every 5 minutes, automatically detecting and loading any new Parquet files that arrive in the ADLS Gen2 container. You can adjust the schedule interval based on your data freshness requirements. Use ALTER TASK FABRIC_AUTO_TASK SUSPEND; to pause the automated ingestion when needed.
Data Validation
After the pipeline executes, run the following validation queries in Snowflake to confirm that data has been loaded correctly and that all files have been tracked.
-- Verify tables were created in the RAW schema
SHOW TABLES IN SCHEMA FABRIC_DB.RAW;
-- Check row counts for loaded tables
SELECT TABLE_NAME, ROW_COUNT
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'RAW' AND TABLE_CATALOG = 'FABRIC_DB';
-- Review file tracking log
SELECT * FROM FABRIC_DB.RAW.FILE_TRACKING ORDER BY LOADED_AT DESC;
-- Sample data from a loaded table
SELECT * FROM FABRIC_DB.RAW.<TABLE_NAME> LIMIT 10;
-- Verify task execution history
SELECT *
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(TASK_NAME => 'FABRIC_AUTO_TASK'))
ORDER BY SCHEDULED_TIME DESC
LIMIT 10;
Conclusion
This pipeline provides a fully automated, end-to-end solution for moving data from Azure SQL Database through Microsoft Fabric OneLake and ADLS Gen2 into Snowflake. By combining the data extraction capabilities of Fabric data pipelines, the storage flexibility of ADLS Gen2 shortcuts, and the dynamic table creation power of Snowflake stored procedures, organizations can build scalable, low-maintenance data ingestion workflows. Once your data lands in Snowflake, use dbt for ELT transformations and ensure proper governance, security, and cost controls.
Frequently Asked Questions
Q: Why can't Snowflake connect directly to Microsoft Fabric OneLake?
Snowflake's external stage operations (LIST, COPY INTO) require endpoints that support Azure Blob Storage or ADLS Gen2 APIs. OneLake's dfs.fabric.microsoft.com endpoint does not fully support these operations, resulting in an IncorrectEndpointError. The solution is to use an ADLS Gen2 shortcut as an intermediary.
Q: Why does Snowflake use blob.core.windows.net instead of dfs.core.windows.net?
Snowflake's Azure storage integration works with the Azure Blob Storage endpoint (blob.core.windows.net). While ADLS Gen2 exposes both blob and dfs endpoints, Snowflake requires the blob endpoint for external stage operations.
Q: How does the dynamic table creation stored procedure work?
The procedure scans the external stage for new Parquet files, uses INFER_SCHEMA to detect column definitions, creates tables dynamically using USING TEMPLATE, and loads data with COPY INTO using MATCH_BY_COLUMN_NAME for flexible schema mapping.
Q: Can this pipeline handle schema changes in source data?
Yes. Since tables are created dynamically using INFER_SCHEMA and data is loaded with MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE, new columns in the source will be detected and mapped automatically when new files arrive.