Hadoop Distributed File System:
In Hadoop, data resides in a distributed file system called the Hadoop Distributed File System (HDFS).
HDFS splits files into blocks and distributes them across the nodes of large clusters.
HDFS is based on the Google File System (GFS) and is designed to run on commodity hardware.
Commodity hardware is cheap and widely available, which makes it useful for achieving greater computational power at low cost.
It is highly fault-tolerant and designed to be deployed on low-cost hardware.
It provides high-throughput access to application data and is suitable for applications with large datasets.
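As a minimal sketch of how a client can observe this block placement, the Java snippet below uses the Hadoop FileSystem API to print the block locations of a file. The NameNode address and the file path /user/data/sample.txt are hypothetical examples and would need to be adjusted for a real cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlockLocations {
    public static void main(String[] args) throws Exception {
        // Assumed NameNode address; adjust fs.defaultFS for your cluster.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:9000");

        FileSystem fs = FileSystem.get(conf);

        // Hypothetical example file stored in HDFS.
        Path file = new Path("/user/data/sample.txt");
        FileStatus status = fs.getFileStatus(file);

        // Each BlockLocation describes one block of the file and the
        // DataNodes that hold a replica of that block.
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("offset=" + block.getOffset()
                    + " length=" + block.getLength()
                    + " hosts=" + String.join(",", block.getHosts()));
        }

        fs.close();
    }
}

Running this against a multi-node cluster prints one line per block, showing how a single large file is spread over several DataNodes.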
In addition to HDFS, the Hadoop framework includes the following two modules:
Hadoop Common: These are Java libraries and utilities required by other Hadoop modules.
Hadoop YARN: This is a framework for job scheduling and cluster resource management.
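To illustrate YARN's role in cluster resource management, the sketch below uses the YarnClient API to list the nodes in a cluster and their resource capacities. It assumes that a yarn-site.xml pointing at a running ResourceManager is available on the classpath.

import java.util.List;

import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ListClusterNodes {
    public static void main(String[] args) throws Exception {
        // Assumes yarn-site.xml on the classpath points at the ResourceManager.
        YarnConfiguration conf = new YarnConfiguration();

        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();

        // Ask the ResourceManager for all nodes currently in the RUNNING state.
        List<NodeReport> nodes = yarnClient.getNodeReports(NodeState.RUNNING);
        for (NodeReport node : nodes) {
            System.out.println(node.getNodeId()
                    + " capability=" + node.getCapability());
        }

        yarnClient.stop();
    }
}

Each line of output shows one NodeManager and the memory and vcores it offers, which is the information YARN's scheduler uses when placing jobs.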