Cursor Follower
snowflake:SnowSQL Logo

Loading Semi-Structured Data from AWS S3

Celestinfo Software Solutions Pvt. Ltd. Jun 19, 2025

Introduction

Welcome to this step-by-step guide on loading unstructured JSON data from an AWS S3 bucket into Snowflake! If you’re new to Snowflake or data engineering, don’t worry—this blog will walk you through the process in a clear and approachable way. By the end, you’ll know how to set up an S3 integration, stage JSON data, parse it, and load it into both unstructured and structured tables in Snowflake. Let’s dive in!

Prerequisites

Before we start, ensure you have:

Step 1: Set up your Snowflake Environment


First, let’s create a database and schema to organize our work.

S3 Service Dashboard
S3 Service Dashboard
S3 Service Dashboard

Step 2:Connect Snowflake to AWS S3

To access files in S3, we need a storage integration to securely link Snowflake with your S3 bucket.

S3 Service Dashboard
S3 Service Dashboard

After creating storage integration, update the value of ‘STORAGE_AWS_EXTERNAL_ID’ and ‘STORAGE_AWS_IAM_USER_ARN’.


Copy the values from the output, and update the trust policy in AWS IAM role. If you cant understand this step please go through our previous blog.

S3 Service Dashboard
S3 Service Dashboard
S3 Service Dashboard

Step 3: Define a JSON File Format

Snowflake needs to know how to parse the JSON files. We’ll create a file format for JSON.

S3 Service Dashboard
S3 Service Dashboard

Step 4: Create an External Stage

An external stage points to your S3 bucket, linking it to Snowflake via the storage integration.

S3 Service Dashboard
S3 Service Dashboard

After creating the external stage, check the external stage has connected correctly or not. Just by listing, what file are there in the stage.

S3 Service Dashboard
S3 Service Dashboard

Step 5: Load JSON into an Unstructured Table

Let’s create a table with a VARIANT column to store the raw JSON data.

S3 Service Dashboard
S3 Service Dashboard
S3 Service Dashboard
S3 Service Dashboard

After copying the data from external stage, check if the data is loaded correctly.

S3 Service Dashboard
S3 Service Dashboard
S3 Service Dashboard

Step 6: Parse JSON Data

Now, let’s query and parse the JSON data in UNSTRUCT_TBL. We’ll use Snowflake’s JSON parsing capabilities.

Simple Parsing with DOT Notation

If the JSON is a single object per row (not an array), use dot notation:

S3 Service Dashboard
S3 Service Dashboard
S3 Service Dashboard

Note : while parsing, write the column names properly they are case sensitive.

Parsing Arrays with FLATTEN

If the JSON contains arrays (nested fileds) use the FLATTEN function

S3 Service Dashboard
S3 Service Dashboard
S3 Service Dashboard

  • FLATTEN expands nested arrays or object into rows.
  • TABLE(FLATTEN( )) processes each JSON object in UNS.

  • Step 7: Load JSON into a Structured Table

    To make the data more query-friendly, let’s load it into a structured table with defined columns.

    Create a structured Table

    Define a table matching the JSON schema.

    S3 Service Dashboard
    S3 Service Dashboard

    Insert Parsed JSON into the Structured Table

    Extract fields from UNSTRUCT_TBL and insert them into STUDENTS_DATA.

    S3 Service Dashboard

    Storage Note: Storing data in both UNSTRUCT_TBL and STUDENTS_DATA doubles storage costs. Consider dropping UNSTRUCT_TBL after loading if it’s no longer needed:

    S3 Service Dashboard

    Step 8: Load Directly into a Structured Table

    We can’t directly copy the JSON file from external stage to snowflake Sturcutre table. Start with creating another table to load the same file from S3 blog.

    S3 Service Dashboard
  • l Trying to copy data from external stage to snowflake sutuctured data.

  • S3 Service Dashboard
    S3 Service Dashboard
    S3 Service Dashboard
  • We need to copy the external stage data to internal snowflake managed table stage and copy the data to structure data. While copying the data from external stage, select all columns from the json data using table and flatten function.

  • S3 Service Dashboard
    S3 Service Dashboard
    S3 Service Dashboard
    S3 Service Dashboard
    S3 Service Dashboard

    Conclusion

    Let’s wrap it up!You’ve just taken a big step into the world of Snowflake by learning how to pull JSON data from an AWS S3 bucket and make it shine in structured tables.From setting up your S3 connection to parsing JSON, you’ve got the tools to turn messy data into something, grab your own JSON files, maybe try out Snowpipe for some automation flair.For more informations on Snowpipe, please visit our other blogs.

    Burning Questions
    About CelestInfo

    Simple answers to make things clear.

    How accurate are the AI insights?+

    Our AI insights are continuously trained on large datasets and validated by experts to ensure high accuracy.

    Can I integrate with my existing tools?+

    Absolutely. CelestInfo supports integration with a wide range of industry-standard software and tools.

    What security measures do you have?+

    We implement enterprise-grade encryption, access controls, and regular audits to ensure your data is safe.

    How often are insights updated?+

    Insights are updated in real-time as new data becomes available.

    What kind of support do you offer?+

    We offer 24/7 support via chat, email, and dedicated account managers.

    Still have questions?

    Ready? Let's Talk!

    Get expert insights and answers tailored to your business requirements and transformation.