Migrating from Redshift to Snowflake: A Step-by-Step Playbook

Celestinfo Software Solutions Pvt. Ltd. Mar 06, 2025

Last updated: March 2025

Quick answer: UNLOAD your Redshift data to S3 as Parquet, COPY INTO Snowflake, translate Redshift-specific SQL (DISTKEY/SORTKEY/DISTSTYLE become unnecessary, GETDATE() becomes CURRENT_TIMESTAMP(), CONVERT becomes TO_CHAR/TO_DATE), run row-count and hash-based validation, then cut over. Plan for 8-16 weeks. The biggest surprise will be type-casting errors: Redshift's implicit casting is far more permissive than Snowflake's.

Why Teams Leave Redshift

Nobody migrates a data warehouse for fun. Teams leave Redshift because of real operational pain: cluster resizing requires downtime (or Elastic Resize, which still has limitations on node type changes), concurrency limits force users to wait in WLM queues during peak hours, and the VACUUM/ANALYZE maintenance cycle is a full-time job that never gets done. Snowflake's separation of storage and compute, automatic clustering, and per-query concurrency model solve these specific problems. That's the pitch. Here's how to actually do the migration.


Phase 1: Schema Audit


Before you move a single row, you need a complete inventory of what you're migrating. Run these queries against Redshift's system catalog:


SQL - Redshift schema audit
-- List all tables with row counts and sizes
SELECT "schema", "table",
       tbl_rows AS row_count,
       size AS size_mb
FROM SVV_TABLE_INFO
ORDER BY size DESC;

-- Identify DISTKEY/SORTKEY/DISTSTYLE per table
SELECT "table", diststyle, sortkey1, sortkey_num
FROM SVV_TABLE_INFO
WHERE "schema" = 'public';

-- Find views and their definitions
SELECT schemaname, viewname, definition
FROM pg_views
WHERE schemaname NOT IN ('pg_catalog','information_schema');

Pay special attention to: tables with DISTSTYLE KEY (these relied on co-located joins - Snowflake doesn't need this), tables with compound or interleaved sort keys (Snowflake uses automatic micro-partition pruning instead), and any views using Redshift-specific functions.
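
Views are often the trickiest part of the audit because Redshift-specific functions hide inside their definitions. One quick way to surface them is to scan pg_views for the function names covered in Phase 2 below; a minimal sketch (the pattern list is illustrative, not exhaustive):

SQL - Flag views that use Redshift-specific functions (illustrative)
SELECT schemaname, viewname
FROM pg_views
WHERE schemaname NOT IN ('pg_catalog','information_schema')
  AND (definition ILIKE '%getdate(%'
       OR definition ILIKE '%convert(%'
       OR definition ILIKE '%approximate %');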


Phase 2: SQL Translation


This is where most of the work happens. Here are the Redshift SQL patterns that need to change:


DDL Changes
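
The core DDL change is dropping the distribution and sort clauses; everything else in the CREATE TABLE usually carries over. A minimal sketch, using a hypothetical orders table:

SQL - DDL translation (illustrative)
-- Redshift: distribution and sort keys are baked into the DDL
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_date  DATE,
    amount      DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (customer_id)
SORTKEY (order_date);

-- Snowflake: drop DISTSTYLE/DISTKEY/SORTKEY entirely;
-- micro-partitioning handles pruning automatically
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_date  DATE,
    amount      NUMBER(12,2)
);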



Function Translations


SQL - Common Redshift-to-Snowflake translations
-- GETDATE() -> CURRENT_TIMESTAMP()
-- Redshift:  SELECT GETDATE();
-- Snowflake: SELECT CURRENT_TIMESTAMP();

-- CONVERT(type, value) -> TO_CHAR / TO_DATE / CAST
-- Redshift:  SELECT CONVERT(VARCHAR, order_date);
-- Snowflake: SELECT TO_CHAR(order_date, 'YYYY-MM-DD');  -- or CAST(order_date AS VARCHAR)
-- (Redshift's CONVERT takes only two arguments; SQL Server-style format codes like 112 don't apply)

-- DATEADD with different syntax
-- Redshift:  SELECT DATEADD(day, 7, GETDATE());
-- Snowflake: SELECT DATEADD(day, 7, CURRENT_TIMESTAMP());

-- DATEDIFF returns different precision
-- Redshift:  SELECT DATEDIFF(hour, start_ts, end_ts);
-- Snowflake: SELECT DATEDIFF(hour, start_ts, end_ts);
-- (Same syntax, but watch for timestamp precision differences)

-- LISTAGG behavior differs
-- Redshift:  SELECT LISTAGG(name, ',') WITHIN GROUP (ORDER BY name)
-- Snowflake: SELECT LISTAGG(name, ',') WITHIN GROUP (ORDER BY name)
-- (Same syntax, but Snowflake has a 16MB limit per result)

-- NVL2 is supported in both, but check edge cases
-- ISNULL -> use NVL or COALESCE in Snowflake

-- Approximate count distinct
-- Redshift:  SELECT APPROXIMATE COUNT(DISTINCT user_id) FROM events;
-- Snowflake: SELECT APPROX_COUNT_DISTINCT(user_id) FROM events;

The biggest gotcha: Redshift's implicit type casting is far more permissive than Snowflake's. Redshift will silently cast a VARCHAR to a NUMBER in a WHERE clause; Snowflake throws "Numeric value 'abc' is not recognized". You'll discover type mismatches you never knew existed in your data, so run your test queries against Snowflake early and fix casting issues before cutover. Use TRY_CAST() and TRY_TO_NUMBER() to handle dirty data gracefully.
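
A minimal sketch of that defensive pattern on the Snowflake side (the staging table and columns here are hypothetical):

SQL - Isolating dirty values with TRY_ conversions (illustrative)
-- TRY_TO_NUMBER / TRY_CAST return NULL instead of erroring on bad values
SELECT order_id,
       TRY_TO_NUMBER(amount_raw)        AS amount,
       TRY_CAST(order_date_raw AS DATE) AS order_date
FROM staging.orders_raw
WHERE TRY_TO_NUMBER(amount_raw) IS NULL
  AND amount_raw IS NOT NULL;   -- rows that would have failed a hard cast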


Phase 3: Data Migration


The fastest path is S3 as the bridge:


  1. UNLOAD from Redshift to S3 as Parquet (not CSV - Parquet preserves types and compresses better):
    SQL - Redshift UNLOAD
    UNLOAD ('SELECT * FROM schema.table_name')
    TO 's3://migration-bucket/redshift-export/table_name/'
    IAM_ROLE 'arn:aws:iam::123456789:role/RedshiftUnload'
    FORMAT AS PARQUET
    ALLOWOVERWRITE
    PARALLEL ON
    MAXFILESIZE 256 MB;
  2. Create a Snowflake external stage pointing to the S3 bucket (see the stage sketch after this list).
  3. COPY INTO from stage to Snowflake tables:
    SQL - Snowflake COPY INTO
    COPY INTO schema.table_name
    FROM @migration_stage/redshift-export/table_name/
    FILE_FORMAT = (TYPE = PARQUET)
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
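
Step 2 above assumes an external stage already exists. A minimal sketch, assuming a storage integration named s3_migration_int has been set up for the bucket (both names are hypothetical):

SQL - Snowflake external stage (illustrative)
CREATE OR REPLACE STAGE migration_stage
  URL = 's3://migration-bucket/redshift-export/'
  STORAGE_INTEGRATION = s3_migration_int
  FILE_FORMAT = (TYPE = PARQUET);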

For ongoing replication during the migration period, consider AWS DMS (Database Migration Service) for CDC, or tools like Matillion or Talend for managed ELT. The Snowflake Migration Accelerator tool (available through Snowflake Professional Services) can automate much of the schema and SQL translation work. For details on loading data from Azure to Snowflake, see our Azure-to-Snowflake loading guide.


Phase 4: Performance Tuning in Snowflake
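
Most tables need no tuning after the load; micro-partition pruning does the work that DISTKEY and SORTKEY used to do. The main optional lever is a clustering key on very large (1TB+) tables with predictable filter columns, as noted in the FAQ below. A minimal sketch (table and column names are hypothetical):

SQL - Optional clustering key for a very large table (illustrative)
ALTER TABLE analytics.events CLUSTER BY (event_date);

-- Check how well the table is clustered on that key
SELECT SYSTEM$CLUSTERING_INFORMATION('analytics.events', '(event_date)');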



Phase 5: Testing and Validation
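
The quick answer calls for row-count and hash-based validation: run the same checks on both platforms and compare the outputs. A minimal sketch using hypothetical columns (note that date/timestamp output formats must match on both sides for the hashes to agree):

SQL - Row-count and hash-based validation (illustrative)
-- 1. Row counts per table (run on both Redshift and Snowflake)
SELECT COUNT(*) FROM schema.table_name;

-- 2. Aggregate spot checks on key columns
SELECT COUNT(DISTINCT order_id), SUM(amount), MIN(order_date), MAX(order_date)
FROM schema.table_name;

-- 3. Per-row hashes over a deterministic sample; diff the two result sets
SELECT order_id,
       MD5(COALESCE(CAST(amount AS VARCHAR), '') || '|' ||
           COALESCE(CAST(order_date AS VARCHAR), '')) AS row_hash
FROM schema.table_name
ORDER BY order_id
LIMIT 1000;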



Phase 6: Cutover Planning


The cutover itself should be boring. If it's exciting, you didn't test enough. Here's the pattern that works:


  1. Run Redshift and Snowflake in parallel for 2-4 weeks with data syncing via CDC.
  2. Point non-critical dashboards at Snowflake first. Monitor for issues.
  3. Gradually migrate critical workloads. Keep Redshift as a fallback.
  4. Once all workloads are on Snowflake and stable for 1 week, decommission Redshift.
  5. Don't forget to cancel the Redshift Reserved Instance - we've seen teams pay for idle RI nodes for months after migration.

Key Takeaways


Frequently Asked Questions

Q: How long does a Redshift to Snowflake migration take?

Typical migrations take 8-16 weeks depending on the number of schemas, SQL complexity, and downstream dependencies. Schema translation and SQL conversion usually take 2-4 weeks. Data migration takes 1-2 weeks. Testing and validation take 3-6 weeks.

Q: Do I need to rewrite all my Redshift SQL for Snowflake?

Not all of it. Standard ANSI SQL works in both. But Redshift-specific syntax like DISTKEY, SORTKEY, DISTSTYLE, WLM queues, GETDATE(), CONVERT(), and certain LISTAGG behaviors need to be translated. Expect 15-30% of SQL statements to require changes.

Q: What is the best way to move data from Redshift to Snowflake?

UNLOAD from Redshift to S3 as Parquet files, then COPY INTO from S3 to Snowflake. This avoids data transfer fees and uses both platforms' native bulk load capabilities.

Q: Does Snowflake need DISTKEY and SORTKEY equivalents?

No. Snowflake uses automatic micro-partition pruning instead of distribution and sort keys. For very large tables (1TB+) with predictable query patterns, you can optionally add clustering keys, but most tables perform well without them.

Chakri, Cloud Solutions Architect

Chakri is a Cloud Solutions Architect at CelestInfo with hands-on experience across AWS, Azure, GCP, and Snowflake cloud infrastructure.
