Wednesday, September 3, 2025

Snowflake Database

 CREATING A WAREHOUSE :

---------------------------------------

CREATE OR REPLACE WAREHOUSE FIRST_WAREHOUSE
WITH
  WAREHOUSE_SIZE = XSMALL
  AUTO_SUSPEND = 300
  SCALING_POLICY = 'ECONOMY'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  INITIALLY_SUSPENDED = TRUE
  COMMENT = 'This is our first warehouse';



LOADING DATA :

--------------

* BULK LOADING:

- Uses our own (user-managed) virtual warehouses for compute

- Loads data with the COPY INTO command (see the sketch after this list)

- Supports basic transformations (column selection, casts, string functions) during the load


* CONTINUOUS LOADING:

- Meant for small volumes of data loaded as they arrive

- Runs on serverless compute managed by Snowflake
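
A minimal COPY INTO sketch for a bulk load; the table orders, the stage my_stage, and the file layout are hypothetical placeholders:

  COPY INTO orders
  FROM @my_stage/orders/
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
  ON_ERROR = 'CONTINUE';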



STAGES :

--------

* EXTERNAL STAGE

- Points at storage from an external cloud provider:

- Amazon S3 bucket

- Google Cloud Platform

- Microsoft Azure

- The stage itself is a database object created in a schema (a creation sketch follows this list)


* INTERNAL STAGE

- Storage inside Snowflake, maintained by Snowflake
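
A sketch of creating an external stage over S3; the bucket URL and the inline credentials are hypothetical (a storage integration is the more common choice in practice):

  CREATE OR REPLACE STAGE my_s3_stage
    URL = 's3://my-bucket/data/'
    CREDENTIALS = (AWS_KEY_ID = '...' AWS_SECRET_KEY = '...');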



PERFORMANCE TUNING :

--------------------

Most of the performance tuning is done by Snowflake automatically.


* Dedicated warehouses

- Give DBAs, data scientists, and analysts separate warehouses so their workloads don't compete


* Scaling up

- Increasing (or decreasing) the warehouse size, for larger and more complex queries


* Scaling out

- Adding clusters to a multi-cluster warehouse, for more concurrent users; an ALTER WAREHOUSE sketch follows this list


* Caching

- Automatic: results are cached when a query is executed and reused for matching queries


* Clustering

- Micro-partition clustering is maintained automatically by Snowflake; a cluster key can be defined manually for very large tables
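
A hedged sketch of scaling the warehouse created earlier; the size and cluster counts are illustrative:

  -- Scale up: a larger size for heavier queries
  ALTER WAREHOUSE FIRST_WAREHOUSE SET WAREHOUSE_SIZE = 'MEDIUM';

  -- Scale out: more clusters for higher concurrency
  ALTER WAREHOUSE FIRST_WAREHOUSE SET MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = 5;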



* What is a Snowpipe?

- Snowpipe is Snowflake's continuous, fully managed data ingestion service.

- It automatically loads data into Snowflake from cloud storage services like Amazon S3, Azure Blob Storage, and Google Cloud Storage


  Steps for setting up Snowpipe :

  -------------------------------

- Create a stage pointing at the cloud storage location

- Test the COPY command that will load the files

- Create a pipe wrapping that COPY command (see the sketch below)

- Configure an S3 event notification to trigger the pipe
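
A minimal Snowpipe sketch; the pipe, stage, and table names are hypothetical:

  CREATE OR REPLACE PIPE orders_pipe
    AUTO_INGEST = TRUE
  AS
    COPY INTO orders
    FROM @my_s3_stage/orders/
    FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);

  -- SHOW PIPES; exposes the notification_channel ARN needed
  -- when configuring the S3 event notification.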



* What is Time travel?

- If data or a whole database is deleted or dropped, the historical state can still be queried using Time Travel.

- Typical uses: querying deleted or updated data, restoring tables, schemas, and databases, and creating clones of tables at a past point in time.

 

  eg: SELECT * FROM table_name AT (TIMESTAMP => timestamp);

      SELECT * FROM table_name AT (OFFSET => -10*60);   -- 10 minutes ago

      SELECT * FROM table_name BEFORE (STATEMENT => query_id);

      UNDROP TABLE table_name;

      UNDROP SCHEMA schema_name;

      UNDROP DATABASE database_name;

  

- Standard edition can travel back at most 1 day into the past, whereas Enterprise, Business Critical, and Virtual Private Snowflake editions can travel back up to 90 days.

  This window is called the data retention period (a sketch of setting it follows this list).

- Data loss can happen when restoring with Time Travel: recreating a table (e.g., with CREATE OR REPLACE) drops the current version, so running the same restore a second time can overwrite data you meant to keep.
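
A sketch of changing the retention period on a hypothetical orders table; 30 days assumes Enterprise edition or higher:

  ALTER TABLE orders SET DATA_RETENTION_TIME_IN_DAYS = 30;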



* What is fail safe?

- It is used to recover historical data after a disaster.

- It is a non-configurable 7-day period for permanent tables.

- It starts when the Time Travel retention period ends.

- We cannot query or recover Fail-safe data ourselves as users; we need to contact Snowflake Support to recover it.



DIFFERENT TYPES OF TABLES :

---------------------------

* Permanent tables

- They are created using the CREATE TABLE command.

- The Time Travel retention period applies here.

- Fail-safe applies here.


* Transient tables

- They are created using the CREATE TRANSIENT TABLE command.

- A Time Travel retention period (at most 1 day) applies here, but there is no Fail-safe.


* Temporary tables

- They are created using the CREATE TEMPORARY TABLE command.

- A Time Travel retention period (at most 1 day) applies here.

- They exist only within the current session and disappear when it ends. (Creation sketches for all three follow this list.)
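
A quick sketch of the three creation commands; the table names and columns are placeholders:

  CREATE TABLE perm_orders (id INT);            -- permanent: Time Travel + Fail-safe
  CREATE TRANSIENT TABLE stg_orders (id INT);   -- transient: no Fail-safe
  CREATE TEMPORARY TABLE tmp_orders (id INT);   -- temporary: session-scoped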


Other table types exist as well: external tables, hybrid tables, and Iceberg tables.



CLONING TABLES :

----------------

* Zero copy cloning : the underlying data is not physically copied when cloning.

- It's used to clone databases, schemas, tables, stages, pipes, streams, and tasks.

- All child objects will be cloned, except internal named stages.

- The clone is an independent object; it initially references the original table's storage, and the two diverge as either is modified.

- It will not be affected when the original table is modified.

- SELECT privilege is enough to clone tables.

- OWNERSHIP privilege is required for pipes, streams, and tasks.

- USAGE privilege is required for other objects.

- Load history metadata is not copied.

- Cloning at a specific point in time is possible (see the Time Travel clone sketch after the example).

- A temporary table cannot be cloned to a permanent table.


  eg: CREATE TABLE new_table CLONE source_table;
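
A hedged sketch combining cloning with Time Travel; the names and the offset are illustrative:

  CREATE TABLE orders_restored CLONE orders
    AT (OFFSET => -60*60);   -- the state of orders one hour ago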



SWAPPING TABLES :

-----------------

- It is typically used to promote a development table into production.

- Only metadata is swapped; no data is physically moved.

- It can be done on tables and schemas.


  eg: ALTER TABLE dev_table SWAP WITH prod_table;




DATA SHARING :

--------------

- Here the provider (Account_1) shares data that the consumer (Account_2) can query and also copy into its own tables.

- The provider keeps the data in its own storage (Account_1) and pays for storage; the consumer pays only for the compute used to query the share.


  eg: GRANT USAGE ON DATABASE DATA_S TO SHARE ORDERS_SHARE; 

      GRANT USAGE ON SCHEMA DATA_S.PUBLIC TO SHARE ORDERS_SHARE; 

      GRANT SELECT ON TABLE DATA_S.PUBLIC.ORDERS TO SHARE ORDERS_SHARE; 
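
For context, a hedged sketch of the steps around those grants; the share name matches the example and the consumer account identifier is a placeholder:

  CREATE SHARE ORDERS_SHARE;

  -- ... the three GRANT statements above ...

  ALTER SHARE ORDERS_SHARE ADD ACCOUNTS = consumer_account;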

  

- Data can also be shared with non-Snowflake users via reader accounts, which requires the ACCOUNTADMIN role.




DATA SAMPLING :

---------------

- Instead of processing the full data set, only a sample of the data is taken in data sampling.

- It is useful for query development or data analysis.

- There are 2 methods of sampling: ROW (also called BERNOULLI) and BLOCK (also called SYSTEM); both are sketched below.

- In the ROW / BERNOULLI method, each individual row is selected with the given probability (e.g., 50% keeps roughly half the rows).

- In the BLOCK / SYSTEM method, whole blocks (micro-partitions) are selected with the given probability; it is faster but less granular.
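
Both methods sketched on a hypothetical orders table:

  SELECT * FROM orders SAMPLE ROW (50);      -- ~50% of rows, row-level
  SELECT * FROM orders SAMPLE SYSTEM (50);   -- ~50% of blocks, partition-level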


DYNAMIC DATA MASKING :

----------------------

- Data masking is applied to sensitive data as an additional layer of security.

- This is column-level security: specific columns are designated for masking.

- These rules can be assigned to specific roles.

- To apply masking, create a masking policy, then alter the table to attach the policy to a column.


  eg: CREATE OR REPLACE MASKING POLICY mask_phone_number
      AS (val VARCHAR) RETURNS VARCHAR ->
      CASE
        WHEN CURRENT_ROLE() IN ('', '')   -- list the roles allowed to see the real value
        THEN val
        ELSE '##-###-##'
      END;

      ALTER TABLE IF EXISTS customer MODIFY COLUMN phone
      SET MASKING POLICY mask_phone_number;
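
To detach the policy later, a sketch on the same column:

  ALTER TABLE customer MODIFY COLUMN phone UNSET MASKING POLICY;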
