Tuesday, November 11, 2025

Teradata Database

  How is Teradata different from other databases?

- Parallelism : Massively Parallel Processing (MPP) architecture. Teradata splits a large workload into smaller pieces and processes them in parallel.

- Shared Nothing : Each node works with its own CPU, memory and disk and shares nothing with the other nodes. If any one node fails, another node has a backup and we can make use of it. This is called fault tolerance.

- Linear Scalability : When we need more capacity we can add nodes, scaling up to 4096 nodes.



 Different utilities :

 ---------------------

* These utilities import data into and export data from Teradata.

* FastLoad : Used to load data quickly from a flat file into an empty Teradata table.

* MultiLoad : Used to load multiple tables at a time (up to five in one job).

* TPump : Used to load data into Teradata continuously in small batches, for near real-time updates.

* FastExport : Used to export large volumes of data from Teradata to a flat file.



 Teradata Architecture :

 -----------------------

Teradata is built for massively parallel processing (MPP). It has four main components: Node, Parsing Engine (PE), Message Passing Layer or BYNET, and Access Module Processor (AMP).


Parsing engine : It acts like a gatekeeper.

- Whenever you want to store or fetch data, it checks your authentication.

- It parses your SQL query, checks the syntax, and converts it into an execution plan that the AMPs can run.

- It caches the execution plan and reuses it whenever the same query is submitted again.
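
The plan reuse described above can be sketched in a few lines of Python. This is only an illustration of the caching idea, not Teradata's real code: the `PlanCache` class, the `build_plan` callback and the rule that two queries are "the same" when they match after collapsing whitespace are all assumptions made for the example.

```python
from typing import Callable, Dict


class PlanCache:
    """Toy plan cache. Hypothetical helper, not Teradata's real request cache."""

    def __init__(self, build_plan: Callable[[str], str]) -> None:
        self._build_plan = build_plan      # stands in for the costly parse/optimise step
        self._plans: Dict[str, str] = {}   # normalised SQL text -> plan
        self.hits = 0
        self.misses = 0

    def get_plan(self, sql: str) -> str:
        # Assumption: queries are "the same" if they match after
        # collapsing runs of whitespace.
        key = " ".join(sql.split())
        if key in self._plans:
            self.hits += 1
        else:
            self.misses += 1
            self._plans[key] = self._build_plan(key)
        return self._plans[key]


cache = PlanCache(lambda sql: "PLAN(" + sql + ")")
first = cache.get_plan("SELECT * FROM employee WHERE emp_id = 10")
second = cache.get_plan("SELECT *  FROM employee   WHERE emp_id = 10")
print(first == second, cache.hits, cache.misses)
```

The second call does not rebuild the plan; it only counts as a cache hit.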


BYNET : The name is short for BanYan NETwork.

- BYNET is the communication highway inside Teradata that makes parallel processing possible.

- Teradata has many AMPs (workers) that process data in parallel. These AMPs and other parts (PEs, nodes) need a way to talk to each other.

  That’s what BYNET is — the network inside Teradata. 

- It delivers messages (like SQL requests and results) between the PEs (who receive your query) and the AMPs (who store and process the data).

- There are two BYNETs (BYNET 0 and BYNET 1). If one goes down, the other takes over, so your queries don’t stop.
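
The "if one goes down, the other takes over" behaviour can be pictured with a small Python sketch. The `Channel` class and `send_with_failover` function are hypothetical names invented for this example; the sketch models only the failover idea and says nothing about how the real BYNET is implemented.

```python
from typing import List


class Channel:
    """Hypothetical stand-in for one BYNET link."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True

    def send(self, message: str) -> str:
        if not self.up:
            raise ConnectionError(self.name + " is down")
        return self.name + " delivered: " + message


def send_with_failover(channels: List[Channel], message: str) -> str:
    # Try each link in turn and use the first one that is still up.
    for channel in channels:
        try:
            return channel.send(message)
        except ConnectionError:
            continue
    raise ConnectionError("all links are down")


bynet0 = Channel("BYNET 0")
bynet1 = Channel("BYNET 1")
bynet0.up = False  # simulate a failure of the first network
result = send_with_failover([bynet0, bynet1], "step 1 for AMP 3")
print(result)
```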


AMP : Access Module Processor. It is a virtual processor that owns its own virtual disk.

- An AMP is not hardware. It’s a software process running inside a Teradata node.

- Each AMP is tied to its own slice of disk storage. Teradata may have hundreds of AMPs → that’s how it divides and conquers big data.

- It is a logical process (not a physical machine) that stores a portion of the table rows (on disk), processes queries on that portion (like filtering, joining, aggregating) and returns results to the Parsing Engine (through BYNET).


Disk Storage : 

- Disk Storage in Teradata is just the place where your actual data (tables, rows, indexes) is physically kept.

- Every AMP has its own disk space. So when data is loaded, Teradata spreads rows across different AMPs → which means across different disks.

- When you query, each AMP goes to its own disk storage, pulls the data, and processes it.
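
Teradata decides which AMP gets a row by hashing the row's primary index value. The Python sketch below shows that idea under loud assumptions: `amp_for` and `distribute` are made-up helper names, MD5 is only a stand-in for Teradata's own hashing algorithm, and the real hash-bucket-to-AMP mapping is simplified to a modulo.

```python
import hashlib
from collections import defaultdict
from typing import Any, Dict, List


def amp_for(pi_value: Any, amp_count: int) -> int:
    # Assumption: MD5 + modulo replaces Teradata's real row hash and hash map.
    digest = hashlib.md5(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % amp_count


def distribute(rows: List[Dict[str, Any]], pi_column: str,
               amp_count: int) -> Dict[int, List[Dict[str, Any]]]:
    # Each AMP keeps only the rows whose primary index value hashes to it.
    amps: Dict[int, List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        amps[amp_for(row[pi_column], amp_count)].append(row)
    return amps


rows = [{"emp_id": i, "name": "emp" + str(i)} for i in range(1000)]
amps = distribute(rows, "emp_id", amp_count=4)
for amp_id in sorted(amps):
    print("AMP", amp_id, "holds", len(amps[amp_id]), "rows")
```

The same value always lands on the same AMP, which is why a lookup on the primary index needs only one AMP. A column with many distinct values spreads rows more evenly than a column with few.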




 Primary Index :

 ---------------

- A table can have only one primary index.

- There are two types: unique primary index (UPI) and non-unique primary index (NUPI).

- Primary index columns can contain NULL values.

- Primary index values can be updated, but the primary index definition of a populated table cannot be changed.

- A primary index can combine up to 64 columns.

 

 

 INDEXES :

 ---------

- unique primary index : 

- unique Secondary index :

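- Partitioned primary index : Rows are still distributed to the AMPs by the primary index, but on each AMP they are stored grouped by partition. A query that filters on the partitioning column reads only the matching partitions, so data retrieval is faster.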
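
A rough Python sketch of why partitioning speeds up retrieval is given below. It models a single AMP that groups its rows by month; the `PartitionedAmp` class, the monthly partitioning on `order_date` and the sample rows are assumptions made only for this illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


class PartitionedAmp:
    """Hypothetical model of one AMP that stores rows grouped by month."""

    def __init__(self) -> None:
        self.partitions: Dict[Tuple[int, int], List[Row]] = defaultdict(list)

    def insert(self, row: Row) -> None:
        order_date = row["order_date"]
        self.partitions[(order_date.year, order_date.month)].append(row)

    def scan_month(self, year: int, month: int) -> List[Row]:
        # Only one partition is read; the others are skipped entirely.
        return list(self.partitions.get((year, month), []))

    def total_rows(self) -> int:
        return sum(len(rows) for rows in self.partitions.values())


amp = PartitionedAmp()
order_id = 0
for month in range(1, 13):
    for day in (1, 15):
        order_id += 1
        amp.insert({"order_id": order_id, "order_date": date(2025, month, day)})

march = amp.scan_month(2025, 3)
print(len(march), "rows read out of", amp.total_rows())
```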



 RAID : (Redundant Array of Independent Disks)

 ------

- Teradata protects data against failures at different levels: disk level, node level, AMP level, transaction level and object level.

- At disk level we have RAID 1 (disk mirroring) and RAID 5 (striping with parity), so the data can be recovered if a disk fails.
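
The difference between the two RAID levels can be sketched in Python. RAID 1 keeps a full second copy of each block, while RAID 5 keeps one parity block computed with XOR, from which any single lost block can be rebuilt. The block contents and the `xor_blocks` helper are invented for the example; a real array works on fixed-size stripes across physical disks.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    # XOR the blocks byte by byte; all blocks must have the same length.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# RAID 1: the mirror is simply a second copy of the block.
primary = b"ROW-0001"
mirror = bytes(primary)

# RAID 5: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1], mirror == primary)
```

RAID 1 costs twice the disk space; RAID 5 costs one extra block per stripe but needs the XOR work to rebuild.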


 

 DIFFERENT SPACES :

 ------------------

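Permanent space :

- It is the physical space used to store data permanently.

- Permanent objects such as tables and their rows are stored here, within the limit allocated to each user or database.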


Spool space :

- Used for the intermediate results of calculations, subqueries and joins.


Temporary Space :

- Used by global temporary tables. The data is kept only for the duration of the session.



 

 

 

=================================================================================================================================================




Teradata supports different types of tables :-

---------------------------------------------


Permanent table : This is the default table type. It holds the inserted data and stores it permanently.


Volatile table : The data inserted into a volatile table is retained only during the user session. The table and its data are dropped at the end of the session. These tables are mainly used to hold intermediate data during data transformation.


Global temporary table : The table definition is persistent, but the data in the table is deleted at the end of the user session.


Derived table : A derived table holds the intermediate results of a query. Its lifetime is limited to the query in which it is created, used and dropped.



Set Versus Multiset :-

---------------------

Teradata classifies tables as SET or MULTISET based on how duplicate records are handled. A table defined as SET doesn’t store duplicate records, whereas a MULTISET table can store duplicate records.
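
The SET versus MULTISET rule can be modelled with two small Python classes. `SetTable` and `MultisetTable` are hypothetical names used only here, and returning `False` for a rejected duplicate is an assumption of the sketch; how the database itself reports a duplicate row is not covered.

```python
from typing import List, Set, Tuple

Row = Tuple[object, ...]


class SetTable:
    """Hypothetical model of a SET table: a fully identical row is not stored twice."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self._seen: Set[Row] = set()

    def insert(self, row: Row) -> bool:
        if row in self._seen:
            return False  # duplicate row rejected
        self._seen.add(row)
        self.rows.append(row)
        return True


class MultisetTable:
    """Hypothetical model of a MULTISET table: duplicates are kept."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def insert(self, row: Row) -> bool:
        self.rows.append(row)
        return True


set_table = SetTable()
multiset_table = MultisetTable()
for row in [(1, "Asha"), (1, "Asha"), (2, "Ravi")]:
    set_table.insert(row)
    multiset_table.insert(row)

print(len(set_table.rows), len(multiset_table.rows))
```

The duplicate check compares the whole row, which is extra work on every insert into a SET table.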

