Snowpipe vs Snowpipe Streaming - Real-Time Data Ingestion in Snowflake


Celestinfo Software Solutions Pvt. Ltd. Nov 27, 2025

Last updated: December 2025

Quick answer: Snowpipe is file-based ingestion - it watches for new files on S3/GCS/ADLS and loads them via serverless compute with ~1-3 minute latency. Snowpipe Streaming is row-based - it uses the Snowflake Ingest SDK to insert rows directly without staging files, achieving sub-second latency. Use Snowpipe for batch file drops from ETL tools. Use Snowpipe Streaming for Kafka-style streaming, real-time CDC, or IoT sensor data.

Introduction

Snowflake gives you two serverless ingestion mechanisms, and picking the wrong one can cost you either latency or money. Snowpipe has been around since 2018 and handles file-based loading well. Snowpipe Streaming arrived later and works at the row level. They solve different problems, but the naming makes it easy to confuse them. This guide breaks down exactly how each works, what they cost, and when to use which.

How Snowpipe Classic Works


Snowpipe classic is a file-triggered, serverless ingestion service. You put files on cloud storage (S3, GCS, or Azure Blob/ADLS), and Snowpipe detects them and loads them into a Snowflake table. Under the hood, it uses a serverless compute pool managed by Snowflake - you don't provision a warehouse for it.


The File-Based Pipeline


The typical setup looks like this: your ETL tool or application writes files (CSV, JSON, Parquet, etc.) to an S3 bucket. An S3 event notification fires when a new file lands. That notification goes to an SQS queue that Snowpipe monitors. Snowpipe picks up the notification, spins up serverless compute, runs a COPY INTO operation, and marks the file as loaded.


Snowpipe Definition
CREATE PIPE my_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO my_database.my_schema.my_table
  FROM @my_s3_stage
  FILE_FORMAT = (TYPE = 'PARQUET');

The AUTO_INGEST = TRUE flag tells Snowpipe to listen for event notifications. Without it, you'd need to call the insertFiles REST API manually to trigger loads.
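Event notifications also won't cover files that were staged before the pipe (or the notification wiring) existed. For backfills, `ALTER PIPE ... REFRESH` queues already-staged files from roughly the last 7 days that haven't been loaded yet; the optional PREFIX and MODIFIED_AFTER filters shown here are just to narrow the scan:

```sql
-- Queue already-staged files for ingestion (covers ~7 days of staged files)
ALTER PIPE my_pipe REFRESH
  PREFIX = '2025/11/'                       -- optional: limit to a path prefix
  MODIFIED_AFTER = '2025-11-26T00:00:00Z';  -- optional: limit by file timestamp
```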


Latency and Polling Behavior


Here's the gotcha most people miss: Snowpipe doesn't process files instantly when it gets the SQS notification. It queues the file and processes it on an internal schedule. The minimum end-to-end latency is approximately 60 seconds, and in practice it's 1-3 minutes depending on queue depth and file size. If you're expecting sub-minute latency from Snowpipe classic, you won't get it. It's designed for micro-batch loading, not real-time streaming.
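You don't have to guess at the lag: COPY_HISTORY records both when Snowpipe received the file notification (PIPE_RECEIVED_TIME) and when the load finished (LAST_LOAD_TIME), so the delta is your actual per-file latency. A sketch, assuming a table named MY_TABLE:

```sql
-- Actual Snowpipe latency per file over the last 24 hours
SELECT file_name,
       pipe_received_time,
       last_load_time,
       DATEDIFF(second, pipe_received_time, last_load_time) AS latency_seconds
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'MY_TABLE',
  START_TIME => DATEADD(hours, -24, CURRENT_TIMESTAMP())
))
WHERE pipe_name IS NOT NULL   -- only Snowpipe loads, not manual COPY
ORDER BY last_load_time DESC;
```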


How Snowpipe Streaming Works


Snowpipe Streaming takes a fundamentally different approach. Instead of loading files from cloud storage, it accepts rows directly via the Snowflake Ingest SDK. There's no staging area, no files, no COPY INTO. Your application calls the SDK, passes rows, and they land in the Snowflake table. Latency is measured in seconds, not minutes.


The Row-Based Pipeline


Your application (or the Kafka connector) creates a channel to a target table using the Ingest SDK. It sends rows through the channel. Snowflake buffers them briefly and writes them to the table. The rows are queryable within seconds of being sent. No intermediate files, no staging tables, no COPY command.
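Channels are visible from the Snowflake side too. SHOW CHANNELS lists each streaming channel on a table along with its latest committed offset token, which is how clients resume exactly where they left off after a restart:

```sql
-- List Snowpipe Streaming channels and their committed offsets
SHOW CHANNELS IN TABLE my_database.my_schema.my_table;
```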


The Ingest SDK is Java-only as of early 2026. If you're running a Python pipeline, you can't use Snowpipe Streaming directly - you'd need a Java wrapper service or use the Kafka connector, which handles the SDK integration for you.


Setting Up Snowpipe with S3 Event Notifications


The setup involves four pieces: a storage integration, an external stage, the pipe itself, and the S3 event notification wiring.


  1. Create a storage integration in Snowflake that grants access to your S3 bucket. This creates an IAM trust relationship between Snowflake's AWS account and yours. See our IAM role setup guide for the AWS side.
  2. Create an external stage pointing to the S3 path where files will land.
  3. Create the pipe with AUTO_INGEST = TRUE.
  4. Configure SQS notifications on the S3 bucket. Run SHOW PIPES to get the notification_channel ARN - that's the SQS queue Snowflake created. Add that as the destination in your S3 bucket's event notification configuration.

Complete Snowpipe Setup
-- Storage integration
CREATE STORAGE INTEGRATION s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-role'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/data/');

-- External stage
CREATE STAGE my_s3_stage
  STORAGE_INTEGRATION = s3_int
  URL = 's3://my-bucket/data/';

-- Pipe
CREATE PIPE my_pipe AUTO_INGEST = TRUE AS
  COPY INTO raw.events FROM @my_s3_stage
  FILE_FORMAT = (TYPE = 'JSON');

-- Get SQS ARN for S3 event config
SHOW PIPES LIKE 'my_pipe';

Configuring the Kafka Connector for Snowpipe Streaming


The Snowflake Kafka connector supports both Snowpipe classic and Snowpipe Streaming as ingestion methods. For Streaming, set snowflake.ingestion.method=SNOWPIPE_STREAMING in your connector configuration. The connector handles the Ingest SDK integration - it reads from Kafka topics and pushes rows through Snowpipe Streaming channels.


Kafka Connector Config (Snowpipe Streaming)
{
  "name": "snowflake-sink",
  "config": {
    "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
    "snowflake.url.name": "account.snowflakecomputing.com",
    "snowflake.user.name": "kafka_user",
    "snowflake.private.key": "${file:/secrets/snowflake_key.pem}",
    "snowflake.database.name": "RAW_DB",
    "snowflake.schema.name": "KAFKA",
    "snowflake.ingestion.method": "SNOWPIPE_STREAMING",
    "snowflake.role.name": "KAFKA_ROLE",
    "topics": "clickstream,user_events",
    "buffer.flush.time": "10",
    "buffer.count.records": "10000"
  }
}

The buffer.flush.time and buffer.count.records settings control how often the connector flushes rows to Snowflake. Lower values mean lower latency but more overhead. For most use cases, flushing every 10 seconds or every 10,000 records (whichever comes first) is a good starting point.
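By default, the connector lands each Kafka record into two VARIANT columns: RECORD_CONTENT (the message payload) and RECORD_METADATA (topic, partition, offset, and so on). You query them with path notation; in this sketch, the table name follows the connector's convention for the clickstream topic, and the user_id field is a hypothetical payload attribute:

```sql
-- Inspect rows landed by the Kafka connector (default VARIANT schema)
SELECT RECORD_METADATA:topic::STRING   AS topic,
       RECORD_METADATA:partition::INT  AS kafka_partition,
       RECORD_METADATA:offset::INT     AS kafka_offset,
       RECORD_CONTENT:user_id::STRING  AS user_id  -- assumes a user_id field in the payload
FROM RAW_DB.KAFKA.CLICKSTREAM
LIMIT 100;
```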


Monitoring: COPY_HISTORY and PIPE_STATUS


For Snowpipe classic, COPY_HISTORY in the INFORMATION_SCHEMA shows every file load - status, row count, errors, and load time. SYSTEM$PIPE_STATUS tells you the current state of the pipe: how many files are queued, how many are being processed, and the last time a file was loaded.


Monitoring Queries
-- Recent Snowpipe loads
SELECT * FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'MY_TABLE',
  START_TIME => DATEADD(hours, -24, CURRENT_TIMESTAMP())
));

-- Pipe status
SELECT SYSTEM$PIPE_STATUS('my_pipe');

-- Snowpipe Streaming ingestion history
-- (ACCOUNT_USAGE view, not INFORMATION_SCHEMA; data lags up to ~2 hours)
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.SNOWPIPE_STREAMING_FILE_MIGRATION_HISTORY
WHERE START_TIME >= DATEADD(hours, -24, CURRENT_TIMESTAMP());
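SYSTEM$PIPE_STATUS returns a JSON string, so for dashboards it helps to pull out individual fields. executionState and pendingFileCount are the two worth alerting on: the first tells you whether the pipe is running or paused, the second how deep the backlog is:

```sql
-- Extract alert-worthy fields from the pipe status JSON
SELECT PARSE_JSON(SYSTEM$PIPE_STATUS('my_pipe')):executionState::STRING AS execution_state,
       PARSE_JSON(SYSTEM$PIPE_STATUS('my_pipe')):pendingFileCount::INT  AS pending_files;
```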

Cost Comparison


This is where the decision gets interesting. Snowpipe classic charges per-file overhead - each file incurs a fixed serverless-compute cost regardless of size, so if you're loading thousands of small files (well below the recommended 100-250MB range), the overhead adds up fast. Snowpipe Streaming bills on ingestion throughput instead, with no per-file component, so high-volume row-level ingestion avoids that overhead entirely.
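To see where you actually stand, the ACCOUNT_USAGE.PIPE_USAGE_HISTORY view breaks Snowpipe classic spend down by pipe - credits consumed alongside files and bytes loaded, which makes per-file overhead easy to spot:

```sql
-- Snowpipe cost per pipe over the last 7 days (view lags up to ~2 hours)
SELECT pipe_name,
       SUM(credits_used)   AS credits,
       SUM(files_inserted) AS files,
       SUM(bytes_inserted) AS bytes
FROM SNOWFLAKE.ACCOUNT_USAGE.PIPE_USAGE_HISTORY
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY pipe_name
ORDER BY credits DESC;
```

A high credits-to-bytes ratio on a pipe is the signature of many small files; the fix is usually batching writes upstream or moving that source to Snowpipe Streaming.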



The general rule: if your source produces files (ETL output, log rotation, data exports), use Snowpipe classic. If your source produces individual events or records (Kafka, CDC streams, IoT telemetry), use Snowpipe Streaming.


When to Use Snowpipe Classic

  - Your source already produces files: ETL output, log rotation, scheduled data exports.
  - Latency of 1-3 minutes is acceptable - micro-batch, not real-time.
  - Files are reasonably sized, so per-file overhead stays small.
  - You want a hands-off setup driven entirely by cloud storage event notifications.


When to Use Snowpipe Streaming

  - Your source produces individual events or rows: Kafka topics, CDC streams, IoT telemetry.
  - You need seconds-level latency, not minutes.
  - You can run the Java Ingest SDK, or the Kafka connector can run it for you.
  - Row volumes are high enough that writing and loading intermediate files would dominate cost.


Gotchas and Limitations

  - Snowpipe classic has a ~60-second minimum latency; it will never deliver sub-minute ingestion.
  - Per-file overhead punishes many small files - batch writes into larger files where you can.
  - The Snowpipe Streaming Ingest SDK is Java-only; Python pipelines need the Kafka connector or a Java wrapper service.
  - AUTO_INGEST only works once the S3 event notification is wired to the pipe's SQS queue; a missed notification step is the most common cause of a pipe that loads nothing.
  - Snowpipe skips files it has already loaded (load metadata is kept for 14 days), so re-dropping a file with the same name won't reload it.


Key Takeaways

  - Snowpipe is file-based and serverless with ~1-3 minute latency; Snowpipe Streaming is row-based via the Ingest SDK with seconds-level latency.
  - Choose by what your source produces: files mean Snowpipe classic; events or rows mean Snowpipe Streaming.
  - Snowpipe bills per file plus compute; Snowpipe Streaming bills on throughput with no per-file overhead.
  - The Kafka connector is the easiest path to Snowpipe Streaming, especially from non-Java stacks.


Pranay Vatsal, Founder & CEO

Pranay Vatsal is the Founder & CEO of CelestInfo with deep expertise in Snowflake, data architecture, and building production-grade data systems for global enterprises.

Frequently Asked Questions

Q: What is the difference between Snowpipe and Snowpipe Streaming?

Snowpipe is file-based: it detects new files on cloud storage (S3, GCS, ADLS) and loads them using serverless compute. Snowpipe Streaming is row-based: it uses the Snowflake Ingest SDK to insert rows directly into tables without staging files, achieving sub-second latency.

Q: What is the minimum latency for Snowpipe?

Snowpipe has a minimum latency of approximately 60 seconds because it polls for new files on an internal schedule. Actual end-to-end latency is typically 1-3 minutes depending on file size and queue depth.

Q: Does Snowpipe Streaming support Python?

As of early 2026, the Snowflake Ingest SDK for Snowpipe Streaming is Java-only. There is no official Python SDK for Snowpipe Streaming yet. Python-based pipelines typically use the Kafka connector with Snowpipe Streaming or fall back to Snowpipe classic.

Q: How is Snowpipe Streaming different from the Kafka connector?

The Snowflake Kafka connector can use Snowpipe Streaming as its underlying ingestion method (via the 'snowflake.ingestion.method=SNOWPIPE_STREAMING' config). The Kafka connector handles the Kafka consumer logic while Snowpipe Streaming handles the Snowflake-side row insertion.
