Migrating from Redshift to Snowflake: A Step-by-Step Playbook
Last updated: March 2025
Quick answer: UNLOAD your Redshift data to S3 as Parquet, COPY INTO Snowflake, translate Redshift-specific SQL (DISTKEY/SORTKEY/DISTSTYLE become unnecessary, GETDATE() becomes CURRENT_TIMESTAMP(), CONVERT becomes TO_CHAR/TO_DATE), run row-count and hash-based validation, then cutover. Plan 8-16 weeks. The biggest surprise will be type casting errors - Redshift's implicit casting is far more permissive than Snowflake's.
Why Teams Leave Redshift
Nobody migrates a data warehouse for fun. Teams leave Redshift because of real operational pain: cluster resizing requires downtime (or Elastic Resize, which still has limitations on node type changes), concurrency limits force users to wait in WLM queues during peak hours, and the VACUUM/ANALYZE maintenance cycle is a full-time job that never gets done. Snowflake's separation of storage and compute, automatic clustering, and per-query concurrency model solve these specific problems. That's the pitch. Here's how to actually do the migration.
Phase 1: Schema Audit
Before you move a single row, you need a complete inventory of what you're migrating. Run these queries against Redshift's system catalog:
-- List all tables with row counts and sizes
SELECT "schema", "table",
       tbl_rows AS row_count,
       size     AS size_mb
FROM SVV_TABLE_INFO
ORDER BY size DESC;
-- Identify DISTKEY/SORTKEY/DISTSTYLE per table
SELECT "table", diststyle, sortkey1, sortkey_num
FROM SVV_TABLE_INFO
WHERE "schema" = 'public';
-- Find views and their definitions
SELECT schemaname, viewname, definition
FROM pg_views
WHERE schemaname NOT IN ('pg_catalog','information_schema');
Pay special attention to: tables with DISTSTYLE KEY (these relied on co-located joins - Snowflake doesn't need this), tables with compound or interleaved sort keys (Snowflake uses automatic micro-partition pruning instead), and any views using Redshift-specific functions.
Phase 2: SQL Translation
This is where most of the work happens. Here are the Redshift SQL patterns that need to change:
DDL Changes
- DISTKEY / SORTKEY / DISTSTYLE: Remove entirely. Snowflake doesn't use distribution keys. Drop these clauses from all CREATE TABLE statements.
- ENCODE: Remove column compression encodings (e.g., ENCODE lzo, ENCODE zstd). Snowflake handles compression automatically.
- IDENTITY columns: Replace IDENTITY(1,1) with Snowflake's AUTOINCREMENT or IDENTITY (same keyword, slightly different syntax).
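As a concrete before/after sketch (the table and column names here are hypothetical), a typical Redshift CREATE TABLE translates like this:

```sql
-- Redshift DDL (hypothetical orders table)
CREATE TABLE public.orders (
    order_id   BIGINT IDENTITY(1,1),
    user_id    BIGINT ENCODE az64,
    status     VARCHAR(20) ENCODE lzo,
    order_date DATE
)
DISTSTYLE KEY
DISTKEY (user_id)
SORTKEY (order_date);

-- Snowflake equivalent: distribution, sort, and encoding clauses dropped
CREATE TABLE public.orders (
    order_id   BIGINT IDENTITY(1,1),  -- or: BIGINT AUTOINCREMENT
    user_id    BIGINT,
    status     VARCHAR(20),
    order_date DATE
);
```

Everything Redshift needed the physical-layout clauses for (co-located joins, zone maps) is handled by Snowflake's micro-partitioning, so the translated DDL is strictly simpler.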
Function Translations
-- GETDATE() -> CURRENT_TIMESTAMP()
-- Redshift:  SELECT GETDATE();
-- Snowflake: SELECT CURRENT_TIMESTAMP();

-- CONVERT(type, value) -> TO_CHAR / TO_DATE / CAST
-- (Note: Redshift's CONVERT takes no SQL Server-style format code)
-- Redshift:  SELECT CONVERT(VARCHAR, order_date);
-- Snowflake: SELECT TO_CHAR(order_date, 'YYYY-MM-DD');

-- DATEADD: same syntax, but translate the GETDATE() argument
-- Redshift:  SELECT DATEADD(day, 7, GETDATE());
-- Snowflake: SELECT DATEADD(day, 7, CURRENT_TIMESTAMP());

-- DATEDIFF
-- Redshift:  SELECT DATEDIFF(hour, start_ts, end_ts);
-- Snowflake: SELECT DATEDIFF(hour, start_ts, end_ts);
-- (Same syntax, but watch for timestamp precision differences)

-- LISTAGG
-- Redshift:  SELECT LISTAGG(name, ',') WITHIN GROUP (ORDER BY name)
-- Snowflake: SELECT LISTAGG(name, ',') WITHIN GROUP (ORDER BY name)
-- (Same syntax, but Snowflake caps each result at 16MB)

-- NVL2 is supported in both, but check edge cases
-- ISNULL -> use NVL or COALESCE in Snowflake

-- Approximate count distinct
-- Redshift:  SELECT APPROXIMATE COUNT(DISTINCT user_id) FROM events;
-- Snowflake: SELECT APPROX_COUNT_DISTINCT(user_id) FROM events;
The biggest gotcha: Redshift's implicit type casting is far more permissive than Snowflake's. Redshift silently casts a VARCHAR to a NUMBER in a WHERE clause. Snowflake will throw Numeric value 'abc' is not recognized. You'll discover type mismatches you never knew existed in your data. Run your test queries against Snowflake early and fix casting issues before cutover. Use TRY_CAST() and TRY_TO_NUMBER() to handle dirty data gracefully.
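A sketch of the defensive pattern (the payments table and legacy_amount column are hypothetical):

```sql
-- Fails in Snowflake if any legacy_amount value isn't numeric:
--   SELECT * FROM payments WHERE legacy_amount > 100;
-- Defensive version: non-numeric strings become NULL instead of erroring
SELECT *
FROM payments
WHERE TRY_TO_NUMBER(legacy_amount) > 100;

-- Audit how much dirty data Redshift was silently accepting
SELECT COUNT(*) AS bad_rows
FROM payments
WHERE legacy_amount IS NOT NULL
  AND TRY_CAST(legacy_amount AS NUMBER) IS NULL;
```

Run the audit query per suspect column during Phase 2; every non-zero bad_rows count is a cutover bug you just avoided.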
Phase 3: Data Migration
The fastest path is S3 as the bridge:
- UNLOAD from Redshift to S3 as Parquet (not CSV - Parquet preserves types and compresses better):
SQL - Redshift UNLOAD
UNLOAD ('SELECT * FROM schema.table_name')
TO 's3://migration-bucket/redshift-export/table_name/'
IAM_ROLE 'arn:aws:iam::123456789:role/RedshiftUnload'
FORMAT AS PARQUET
ALLOWOVERWRITE
PARALLEL ON
MAXFILESIZE 256 MB;
- Create a Snowflake external stage pointing to the S3 bucket.
- COPY INTO from stage to Snowflake tables:
SQL - Snowflake COPY INTO
COPY INTO schema.table_name
FROM @migration_stage/redshift-export/table_name/
FILE_FORMAT = (TYPE = PARQUET)
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
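The external stage itself is one-time setup; a minimal sketch, assuming a storage integration named s3_migration_int has already been created by an account admin and granted access to the bucket (both names are placeholders):

```sql
-- External stage over the S3 export location
CREATE OR REPLACE STAGE migration_stage
  URL = 's3://migration-bucket/redshift-export/'
  STORAGE_INTEGRATION = s3_migration_int
  FILE_FORMAT = (TYPE = PARQUET);

-- Sanity check: list the unloaded Parquet files before running COPY INTO
LIST @migration_stage;
```

A storage integration is preferred over embedding AWS keys in the stage definition; it delegates auth to an IAM role and keeps credentials out of SQL history.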
For ongoing replication during the migration period, consider AWS DMS (Database Migration Service) for CDC, or tools like Matillion or Talend for managed ELT. The Snowflake Migration Accelerator tool (available through Snowflake Professional Services) can automate much of the schema and SQL translation work. For details on loading data from Azure to Snowflake, see our Azure-to-Snowflake loading guide.
Phase 4: Performance Tuning in Snowflake
- Warehouse sizing: Start with X-Small for development. Use Small or Medium for production queries. Don't over-provision - Snowflake's auto-suspend (set to 60-300 seconds) and auto-resume mean you only pay when queries are running.
- Clustering keys: Most tables don't need them. Snowflake's automatic micro-partition pruning handles 90% of use cases. Add clustering keys only for very large tables (1TB+) where query patterns consistently filter on specific columns. Check pruning efficiency with SYSTEM$CLUSTERING_INFORMATION('table_name').
- Result caching: Snowflake caches query results for 24 hours. If a dashboard runs the same query and the underlying data hasn't changed, the result returns instantly at zero compute cost. This alone often makes Snowflake feel faster than Redshift for BI workloads.
- WLM queue mapping: Redshift's WLM (Workload Management) queues map to separate Snowflake virtual warehouses. Create a WH_ETL for pipeline jobs and a WH_BI for dashboard queries. They share storage but don't compete for compute. For more on this, see managing compute workloads for ETL vs analytics.
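The two-warehouse split sketched above looks like this in DDL (sizes and timeouts are starting points, not recommendations for every workload):

```sql
-- Pipeline warehouse: suspends quickly between batch jobs
CREATE WAREHOUSE wh_etl
  WAREHOUSE_SIZE = 'MEDIUM'
  AUTO_SUSPEND   = 60      -- seconds idle before suspending
  AUTO_RESUME    = TRUE
  INITIALLY_SUSPENDED = TRUE;

-- BI warehouse: longer suspend window keeps the local disk cache
-- warm between dashboard refreshes
CREATE WAREHOUSE wh_bi
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND   = 300
  AUTO_RESUME    = TRUE
  INITIALLY_SUSPENDED = TRUE;
```

The asymmetric AUTO_SUSPEND values are deliberate: ETL jobs arrive in bursts and benefit from fast suspend, while BI traffic trickles in all day and benefits from a warm cache.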
Phase 5: Testing and Validation
- Row count validation: Compare SELECT COUNT(*) for every table between Redshift and Snowflake. This catches load failures and filtering issues.
- Hash-based data comparison: For critical tables, compute a hash of key columns in both systems and compare. Use MD5(CONCAT(col1, col2, col3)) aggregated per partition to verify data integrity without comparing every row.
- Query regression testing: Run your top 50 most-executed queries (from Redshift's STL_QUERY log) against both systems and compare results. Watch for subtle differences in NULL handling, date truncation, and floating-point precision.
- BI dashboard validation: Have analysts open every production dashboard pointing at Snowflake and verify the numbers match. Don't skip this - dashboards often have implicit assumptions about query behavior that unit tests won't catch.
Phase 6: Cutover Planning
The cutover itself should be boring. If it's exciting, you didn't test enough. Here's the pattern that works:
- Run Redshift and Snowflake in parallel for 2-4 weeks with data syncing via CDC.
- Point non-critical dashboards at Snowflake first. Monitor for issues.
- Gradually migrate critical workloads. Keep Redshift as a fallback.
- Once all workloads are on Snowflake and stable for 1 week, decommission Redshift.
- Don't forget to cancel the Redshift Reserved Instance - we've seen teams pay for idle RI nodes for months after migration.
Key Takeaways
- UNLOAD to S3 as Parquet, COPY INTO Snowflake. It's the fastest and cheapest data migration path.
- Drop DISTKEY, SORTKEY, DISTSTYLE, and ENCODE from all DDL. Snowflake doesn't need them.
- Budget for 15-30% of your SQL statements needing translation work. The biggest pain point is implicit type casting differences.
- Use TRY_CAST() in Snowflake to handle dirty data that Redshift silently accepted.
- Map Redshift WLM queues to separate Snowflake warehouses. Set AUTO_SUSPEND to 60-300 seconds.
- Run parallel systems for 2-4 weeks before cutover. Test every dashboard, not just every query.
Frequently Asked Questions
Q: How long does a Redshift to Snowflake migration take?
Typical migrations take 8-16 weeks depending on the number of schemas, SQL complexity, and downstream dependencies. Schema translation and SQL conversion usually take 2-4 weeks. Data migration takes 1-2 weeks. Testing and validation take 3-6 weeks.
Q: Do I need to rewrite all my Redshift SQL for Snowflake?
Not all of it. Standard ANSI SQL works in both. But Redshift-specific syntax like DISTKEY, SORTKEY, DISTSTYLE, WLM queues, GETDATE(), CONVERT(), and certain LISTAGG behaviors need to be translated. Expect 15-30% of SQL statements to require changes.
Q: What is the best way to move data from Redshift to Snowflake?
UNLOAD from Redshift to S3 as Parquet files, then COPY INTO from S3 to Snowflake. This avoids data transfer fees and uses both platforms' native bulk load capabilities.
Q: Does Snowflake need DISTKEY and SORTKEY equivalents?
No. Snowflake uses automatic micro-partition pruning instead of distribution and sort keys. For very large tables (1TB+) with predictable query patterns, you can optionally add clustering keys, but most tables perform well without them.
