Cloud Migration for Data Workloads: A Realistic Timeline and What Nobody Tells You
Quick answer: A small data warehouse migration (<1TB) realistically takes 6–10 weeks. Mid-size (1–10TB) takes 3–6 months. Large enterprise takes 12–18 months. The technical migration is roughly 40% of the effort - the rest is assessment, validation, stored procedure conversion, parallel running, and training your analysts on new tools. Start with read-only analytics workloads, not production transactional systems.
Last updated: December 2025
The 6 R's Applied to Data Workloads
A realistic data workload cloud migration takes 3-6 months for a mid-size warehouse (1-10TB) and 12-18 months for enterprise (10TB+). The critical path for a mid-size project is: assess current workloads (2-4 weeks), design target architecture (2-3 weeks), migrate in waves starting with non-production (8-16 weeks), then cut over with parallel running (2-4 weeks). Below are the 6 R's applied specifically to data infrastructure, followed by the actual timeline breakdown.
Rehost (Lift and Shift)
Move the database to cloud VMs with minimal changes. Sounds simple. Rarely works well. An Oracle instance tuned for SAN storage with specific I/O patterns doesn't perform the same on EBS volumes. You'll spend weeks re-tuning parameters, and you'll still pay Oracle licensing on top of cloud compute costs. Use rehosting only as a temporary step to get off physical hardware while you plan a real migration.
Replatform
Move to a managed service with modest changes: on-prem PostgreSQL to Amazon RDS, or SQL Server to Azure SQL Managed Instance. This preserves most of your existing code while eliminating infrastructure management. It's the pragmatic middle ground for workloads where a full rewrite isn't justified.
Refactor
Redesign for cloud-native architecture. Moving from Oracle to Snowflake or Azure Synapse falls here. This is the most work upfront but delivers the biggest long-term benefits: elastic compute, separation of storage and compute, and consumption-based pricing. Budget 30% of migration time for stored procedure conversion alone.
Repurchase
Replace with a SaaS product. If your on-prem reporting server can be replaced by a cloud BI tool, do it. Don't migrate what you can replace.
Retain
Leave it where it is. Some legacy systems have regulatory constraints, are scheduled for decommission in 12 months, or have such complex dependencies that migration cost exceeds benefit. That's fine. Not everything needs to move.
Retire
Turn it off. During assessment, you'll discover databases nobody uses, reports nobody reads, and pipelines feeding dashboards that were abandoned 2 years ago. Every retired workload is one you don't have to migrate, test, or pay for.
Why "Lift and Shift" for Databases Rarely Works
When someone says "let's just move it to the cloud," they're imagining a forklift. Databases don't work like that. On-prem databases are tuned for specific hardware: SAN storage latency profiles, local SSD IOPS characteristics, specific CPU cache sizes, and dedicated network bandwidth. Change any of those variables, and query performance changes unpredictably.
We've seen Oracle instances that ran sub-second queries on-prem suddenly take 8-12 seconds on EC2 because the I/O patterns that were fast on SAN became slow on gp3 EBS. The indexing strategy was optimized for hardware that no longer exists. It's not a bug - it's physics.
Realistic Timelines (With Honest Buffer)
- Small warehouse (<1TB, <50 tables, no stored procedures): 6–10 weeks
- Mid-size (1–10TB, 50–500 tables, some SPs): 3–6 months
- Large enterprise (10TB+, 500+ tables, complex SP ecosystem): 12–18 months
Anyone promising a large enterprise migration in 3 months is either scoping only a subset of workloads (which is a valid strategy) or cutting corners on validation (which isn't). For a typical project, the timeline breaks down roughly as: assessment (15%), schema migration and pipeline builds (25%), stored procedure conversion (20%), data migration and validation (25%), parallel run and cutover (15%). On SP-heavy refactors such as Oracle to Snowflake, the stored procedure share climbs toward 30% and squeezes the other phases.
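To make the percentage breakdown above concrete, here is a minimal sketch that splits a total duration into per-phase weeks. The shares come straight from the breakdown; the function name `phase_weeks` and the 20-week example project are illustrative assumptions, not figures from a real engagement.

```python
# Phase shares taken from the breakdown above; they must sum to 100%.
PHASE_SHARES = {
    "assessment": 0.15,
    "schema migration and pipeline builds": 0.25,
    "stored procedure conversion": 0.20,
    "data migration and validation": 0.25,
    "parallel run and cutover": 0.15,
}


def phase_weeks(total_weeks: float) -> dict:
    """Split a total project duration (in weeks) into per-phase weeks."""
    if abs(sum(PHASE_SHARES.values()) - 1.0) > 1e-9:
        raise ValueError("phase shares must sum to 100%")
    return {
        phase: round(total_weeks * share, 1)
        for phase, share in PHASE_SHARES.items()
    }


# Hypothetical 20-week mid-size project.
for phase, weeks in phase_weeks(20).items():
    print(f"{phase}: {weeks} weeks")
```

If your stored procedure estate is heavy, adjust the shares in `PHASE_SHARES` before planning; the sum check will flag a breakdown that no longer adds up.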
The Assessment Phase Nobody Wants to Do
This is the phase that gets skipped or rushed, and it's exactly the phase that prevents disasters later. You need to:
- Inventory all data sources. Not just the ones in the warehouse - the spreadsheets on shared drives, the API feeds that run on someone's laptop, the FTP drops from vendors. You can't migrate what you don't know about.
- Map all dependencies. Which reports use which tables? Which pipelines feed which dashboards? If you migrate Table A but Table B still depends on an on-prem join, your pipeline breaks.
- Identify the 20% that gets 80% of usage. Analyze 30 days of query logs. You'll typically find that roughly 20% of your reports account for about 80% of all queries. These are your critical-path workloads - validate them first, test them hardest.
- Document stored procedures and their business logic. SPs are where migration projects go to die. Many were written 5-10 years ago by people who've left. They contain business logic that isn't documented anywhere else.
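The usage analysis in the list above can be sketched in a few lines. This assumes you have already exported the query log as a flat list with one report name per executed query; the export step, the function name `critical_reports`, and the sample names are hypothetical.

```python
from collections import Counter


def critical_reports(query_log: list, coverage: float = 0.80) -> list:
    """Return the most-queried reports that together reach `coverage` of all queries."""
    counts = Counter(query_log)
    total = sum(counts.values())
    selected, running = [], 0
    for report, n in counts.most_common():
        if running >= coverage * total:
            break
        selected.append(report)
        running += n
    return selected


# Hypothetical 30-day log: one entry per executed query.
log = ["sales"] * 70 + ["inventory"] * 15 + ["hr"] * 10 + ["misc"] * 5
print(critical_reports(log))
```

The reports this returns are the ones to validate first and test hardest; everything outside the list is a candidate for a lighter check, or for the Retire bucket if its count is near zero.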
Data Validation Strategy
Validation is what separates a migration from a disaster. You need three layers:
1. Automated Row Counts
Every table, every day during migration. Source count must equal target count. Sounds basic, but if you skip this, you'll discover missing data 3 months after cutover when someone's quarterly report doesn't match.
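A minimal sketch of the daily comparison, assuming the per-table counts have already been collected (for example with a `SELECT COUNT(*)` per table on each side) into two dictionaries. How you collect them depends on your platforms and is left out; `row_count_mismatches` is a hypothetical helper name.

```python
from typing import Optional


def row_count_mismatches(source: dict, target: dict) -> dict:
    """Compare per-table row counts.

    Returns {table: (source_count, target_count)} for every table whose counts
    differ. A table missing on one side shows up with None for that side.
    """
    mismatches = {}
    for table in sorted(source.keys() | target.keys()):
        src: Optional[int] = source.get(table)
        tgt: Optional[int] = target.get(table)
        if src != tgt:
            mismatches[table] = (src, tgt)
    return mismatches


# Hypothetical counts gathered from both systems.
source_counts = {"orders": 1000, "customers": 250, "returns": 40}
target_counts = {"orders": 1000, "customers": 249}
print(row_count_mismatches(source_counts, target_counts))
```

Treat any non-empty result as a failed run for that day rather than something to investigate later.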
2. Hash Comparisons
Compute checksums on critical columns (revenue, quantities, dates) and compare source vs. target. Row counts can match while the data is corrupted: a column whose timestamps were truncated to dates will have the right count but the wrong values.
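One way to sketch this, assuming rows have been fetched from each side as dictionaries and that values are normalized to the same Python types on both sides (driver differences such as `Decimal` vs. `float` would otherwise produce false alarms). The function name `column_checksum` is hypothetical, and in practice you would usually push the hashing down into each database rather than pull rows out.

```python
import hashlib


def column_checksum(rows, columns) -> str:
    """Order-independent checksum over the selected columns of dict rows.

    Each row is hashed with SHA-256 and the hashes are summed modulo 2**256,
    so the result does not depend on the order rows are returned in.
    """
    total = 0
    for row in rows:
        canonical = "\x1f".join(repr(row[col]) for col in columns)
        row_hash = hashlib.sha256(canonical.encode("utf-8")).digest()
        total = (total + int.from_bytes(row_hash, "big")) % (1 << 256)
    return f"{total:064x}"


# Same row count, but the target truncated timestamps to dates.
source_rows = [
    {"id": 1, "ts": "2024-03-01 14:30:00"},
    {"id": 2, "ts": "2024-03-02 09:00:00"},
]
target_rows = [
    {"id": 2, "ts": "2024-03-02"},
    {"id": 1, "ts": "2024-03-01"},
]
print(column_checksum(source_rows, ["id", "ts"]) == column_checksum(target_rows, ["id", "ts"]))
```

Summing rather than XOR-ing the row hashes is a deliberate choice here: with XOR, a row duplicated an even number of times would cancel itself out.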
3. Statistical Sampling
For large tables where full comparison is too expensive, sample 1-5% of rows and do detailed field-by-field comparison. Focus on edge cases: NULL handling, unicode characters, numeric precision, and timezone conversions - these are where silent data corruption hides.
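A sketch of one possible sampling approach: hash the primary key to pick the sample, so both systems select the same rows without coordinating, then compare the sampled rows field by field. The helpers `in_sample` and `field_diffs`, and the assumption that each side is available as a primary-key-to-row dictionary, are illustrative only.

```python
import hashlib


def in_sample(key, percent: float) -> bool:
    """Deterministically decide whether a primary key belongs to the sample."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < percent * 100  # percent=5 keeps buckets 0..499


def field_diffs(source: dict, target: dict, percent: float = 5.0) -> list:
    """Compare sampled rows field by field.

    `source` and `target` map primary key -> row dict.
    Returns (key, field, source_value, target_value) tuples; a row missing
    from the target is reported with field=None.
    """
    diffs = []
    for key, src_row in source.items():
        if not in_sample(key, percent):
            continue
        tgt_row = target.get(key)
        if tgt_row is None:
            diffs.append((key, None, src_row, None))
            continue
        for field, src_value in src_row.items():
            tgt_value = tgt_row.get(field)
            if src_value != tgt_value:
                diffs.append((key, field, src_value, tgt_value))
    return diffs


# Edge cases from the text: unicode mangling and NULL turned into empty string.
source = {1: {"name": "Zoë", "amount": 10.5, "note": None}}
target = {1: {"name": "Zo?", "amount": 10.5, "note": ""}}
print(field_diffs(source, target, percent=100))
```

Because the sample is derived from the key rather than from a random generator, rerunning the check after a fix re-examines exactly the same rows.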
The Parallel-Run Period
Run old and new systems simultaneously for 2–4 weeks minimum. Compare outputs daily. This isn't optional - it's how you catch issues that automated validation misses, like a report that's technically "correct" but renders differently because of sort order changes or floating-point precision differences.
Yes, you're paying double during this period. Budget for it. A 3-week parallel run at $15K/week in cloud costs is cheap insurance against a failed migration that costs 6 months to unwind.
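For the daily output comparison during the parallel run, it helps to separate differences that are only sort order or floating-point noise from real value changes. The sketch below is one way to triage them, assuming each report is exported as a list of row dictionaries with a unique, sortable key column; `classify_diff` and the sample data are hypothetical.

```python
import math


def _rows_equivalent(old: dict, new: dict, rel_tol: float) -> bool:
    """True if two rows match, allowing tiny relative differences in floats."""
    if old.keys() != new.keys():
        return False
    for field, a in old.items():
        b = new[field]
        if isinstance(a, float) and isinstance(b, float):
            if not math.isclose(a, b, rel_tol=rel_tol):
                return False
        elif a != b:
            return False
    return True


def classify_diff(old_rows: list, new_rows: list, key: str, rel_tol: float = 1e-9) -> str:
    """Classify a report comparison as 'identical', 'cosmetic', or 'different'."""
    if old_rows == new_rows:
        return "identical"
    old_sorted = sorted(old_rows, key=lambda r: r[key])
    new_sorted = sorted(new_rows, key=lambda r: r[key])
    if len(old_sorted) == len(new_sorted) and all(
        _rows_equivalent(o, n, rel_tol) for o, n in zip(old_sorted, new_sorted)
    ):
        return "cosmetic"  # only row order or float noise differs
    return "different"


old = [{"region": "EU", "revenue": 0.1 + 0.2}, {"region": "US", "revenue": 5.0}]
new = [{"region": "US", "revenue": 5.0}, {"region": "EU", "revenue": 0.3}]
print(classify_diff(old, new, key="region"))
```

A "cosmetic" result still deserves a look, since a changed sort order is exactly the kind of thing that makes a correct report render differently, but it should not block cutover the way a "different" result does.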
Organizational Change Management
Here's the part vendors never talk about: training analysts on new tools is harder than the technical migration. Your SQL analysts who've written Oracle-specific SQL for 8 years now need to learn Snowflake SQL. Your report builders who know SSRS inside-out now need to learn Looker or Power BI. Your DBAs who managed index tuning now need to understand warehouse sizing and compute workload management.
Budget 2-4 weeks of dedicated training time per analyst. Not "watch these videos" training - hands-on workshops where they rebuild their actual reports on the new platform. If you don't invest here, people will build workarounds, export to Excel, and your expensive cloud warehouse becomes a very overpriced file server.
Cost Surprises That Hit Every Migration
- Egress charges. Moving data out of AWS costs $0.09/GB after the first 100GB/month. A 5TB initial transfer comes to roughly $440 at that rate, and ongoing replication keeps adding to it every month.
- License true-ups. Your current database vendor will audit you right when they learn you're leaving. Budget for 3-6 months of overlap licensing.
- Double-running costs. During parallel run, you're paying for both systems. For a mid-size warehouse, expect $10-20K/month in additional cloud costs during this period.
- Consultant costs for SP conversion. Complex stored procedures that took your team 5 years to build don't convert themselves. Third-party conversion tools handle 60-70% of the syntax; the rest is manual rework.
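The egress figure from the list above is easy to sanity-check. This sketch treats the $0.09/GB rate and the 100GB free allowance as flat, which is a simplification: actual cloud pricing is tiered and varies by region, so check the current price list before budgeting. The function name `egress_cost_usd` is hypothetical.

```python
def egress_cost_usd(gb_transferred: float,
                    rate_per_gb: float = 0.09,
                    free_gb: float = 100.0) -> float:
    """Estimate one month's egress cost with a flat rate after a free allowance."""
    billable_gb = max(0.0, gb_transferred - free_gb)
    return round(billable_gb * rate_per_gb, 2)


# 5TB initial transfer, using 1TB = 1000GB.
initial = egress_cost_usd(5_000)
# Hypothetical ongoing replication of 500GB per month.
monthly = egress_cost_usd(500)
print(f"initial transfer: ${initial}, replication: ${monthly}/month")
```

The one-off transfer is rarely the problem; the line item to watch is whatever keeps flowing out every month during replication and the parallel run.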
What to Migrate First
Start with analytics and BI workloads. They're read-only, low-risk, and let your team learn the new platform without endangering transactional systems. If a dashboard runs slow for a day, nobody loses money. If an order processing pipeline breaks, every minute costs revenue.
Migration order: (1) analytics/BI workloads, (2) batch ETL pipelines, (3) reporting databases, (4) operational data stores, (5) transactional systems. Each phase builds confidence and reveals issues before they matter in high-stakes systems.
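Within each of those phases, the dependency map from the assessment can drive the order in which individual objects move, so nothing lands in the cloud before the tables it reads from. A minimal sketch using Python's standard-library `graphlib` (3.9+); the table names and the helper `migration_order` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def migration_order(depends_on: dict) -> list:
    """Order objects so each one comes after everything it reads from.

    `depends_on` maps an object name to the set of objects it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical dependency map built during assessment.
deps = {
    "dashboard_sales": {"fact_orders"},
    "fact_orders": {"stg_orders", "dim_customer"},
    "stg_orders": set(),
    "dim_customer": set(),
}
print(migration_order(deps))
```

Objects with no dependency between them can come out in any relative order, so treat the result as one valid sequence rather than the only one.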
The Stored Procedure Gotcha
Stored procedures written for Oracle PL/SQL don't translate 1:1 to Snowflake Scripting, Snowflake JavaScript, or Snowpark Python. Budget 30% of your total migration time for SP conversion. Some Oracle features (autonomous transactions, nested cursors with BULK COLLECT, DBMS_SCHEDULER jobs) have no direct equivalent and require architectural rethinking. Start SP analysis in the assessment phase, not after you've committed to a timeline.
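A rough first pass at that SP analysis is to scan the procedure source for the constructs named above. This is a text-matching sketch, not a PL/SQL parser: it will miss dynamic SQL and can match inside comments, and the pattern set and the function name `flag_constructs` are assumptions for illustration.

```python
import re

# Constructs called out above as having no direct equivalent.
HARD_PATTERNS = {
    "autonomous transaction": re.compile(r"PRAGMA\s+AUTONOMOUS_TRANSACTION", re.IGNORECASE),
    "bulk collect": re.compile(r"BULK\s+COLLECT", re.IGNORECASE),
    "DBMS_SCHEDULER": re.compile(r"DBMS_SCHEDULER\s*\.", re.IGNORECASE),
}


def flag_constructs(sp_source: str) -> list:
    """Return the names of hard-to-convert constructs found in one procedure."""
    return [name for name, pattern in HARD_PATTERNS.items() if pattern.search(sp_source)]


# Hypothetical procedure body.
sample = """
CREATE OR REPLACE PROCEDURE log_event(p_msg VARCHAR2) IS
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
  INSERT INTO event_log(msg) VALUES (p_msg);
  COMMIT;
END;
"""
print(flag_constructs(sample))
```

Run over every procedure in the schema, the counts per construct give an early signal of how much of the SP estate needs redesign rather than translation.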
Key Takeaways
- Lift-and-shift rarely works for databases. On-prem performance tuning doesn't transfer to cloud hardware.
- Realistic timelines: 6–10 weeks (small), 3–6 months (mid), 12–18 months (enterprise). Anyone promising faster is either scoping a subset of workloads or cutting validation corners.
- The assessment phase prevents disasters. Inventory sources, map dependencies, identify the 20% of reports that get 80% of usage.
- Validate with three layers: row counts, hash comparisons, and statistical sampling. Run parallel for 2–4 weeks minimum.
- Budget 20–30% above estimates for egress charges, double-running costs, and license true-ups.
- Start with read-only analytics workloads, not transactional systems. Build confidence before raising stakes.
Frequently Asked Questions
How long does a cloud data warehouse migration actually take?
Realistic timelines: small warehouse (under 1TB, fewer than 50 tables) takes 6-10 weeks. Mid-size (1-10TB, 50-500 tables) takes 3-6 months. Large enterprise (10TB+, 500+ tables, complex stored procedures) takes 12-18 months. Anyone promising significantly faster timelines is either cutting corners on validation or hasn't accounted for stored procedure conversion.
Why doesn't lift-and-shift work for databases?
On-prem databases are tuned for specific hardware: SAN storage, local SSDs, specific CPU architectures, and network configurations. An Oracle database tuned for SAN storage won't perform the same on EBS volumes. Query plans, indexing strategies, and stored procedure performance all change when the underlying compute and storage characteristics change.
What should I migrate first to the cloud?
Start with analytics and BI workloads. They're read-only, low-risk, and let your team learn the new platform without endangering production transactional systems. If a dashboard runs slow for a day, nobody loses money. If an order processing pipeline breaks, every minute costs revenue.
What are the hidden costs of cloud data migration?
The three biggest cost surprises are: egress charges (moving data out of one cloud or between regions), double-running costs during the parallel-run period (you're paying for both old and new systems for 2-4 weeks minimum), and license true-ups when your existing vendor learns you're leaving. Budget 20-30% above your estimated costs for these surprises.