Sunday, March 9, 2025

Hadoop/Hive

The term big data refers to volumes of data so large that they are very difficult to handle with traditional tools. To address this, the Apache Software Foundation created a framework called Hadoop.

Hadoop is an open-source framework for storing and processing big data. Its two main modules are the Hadoop Distributed File System (HDFS) and MapReduce.

MapReduce: It is used to process large volumes of structured, semi-structured, and unstructured data.
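To make the MapReduce idea more concrete, here is a minimal word-count sketch in plain Python. It does not use Hadoop or any of its APIs; it only imitates the map, shuffle, and reduce steps in memory. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs big tools", "Hadoop stores big data"]
    print(word_count(sample))
```

In a real Hadoop job, the map and reduce steps would run in parallel on many machines and the input would be read from HDFS. Here everything runs in one process, purely to show the flow of data.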

HDFS: It is used to store the data.

Certain tools, such as Sqoop, Pig, and Hive, support the Hadoop modules.

Hive is not a relational database. It is a query engine that runs SQL-like queries over data stored in Hadoop.

The import command is used to bring data from a DBMS into Hive, and the export command is used to write data out to a specific location.


