
Hadoop Distributed File System (HDFS)

The Hadoop Distributed File System (HDFS) is a distributed file system with high fault tolerance against hardware failures. HDFS operates in master-slave mode and stores its metadata on the master node, the NameNode.

The file system metadata is managed as a directory structure. The user data itself is stored on the slave nodes, which are called DataNodes. The NameNode processes incoming client requests and organizes how files are stored across the DataNodes.
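The following is a minimal sketch of how a client interacts with this architecture through Hadoop's Java FileSystem API; the NameNode address hdfs://namenode:9000, the class name, and the file path are placeholder assumptions, not values from this article. The client exchanges only metadata with the NameNode, while the payload data is streamed to and from the DataNodes directly.

    import java.net.URI;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class HdfsClientSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode address; replace with the
            // fs.defaultFS of the actual cluster.
            FileSystem fs = FileSystem.get(
                    URI.create("hdfs://namenode:9000"), conf);

            Path file = new Path("/user/demo/hello.txt");

            // Write: the client asks the NameNode for target DataNodes,
            // then streams the data to those DataNodes directly.
            try (FSDataOutputStream out = fs.create(file)) {
                out.write("Hello, HDFS!".getBytes(StandardCharsets.UTF_8));
            }

            // Read: the NameNode returns the block locations; the data
            // itself is fetched from the DataNodes.
            try (FSDataInputStream in = fs.open(file)) {
                IOUtils.copyBytes(in, System.out, 4096, false);
            }
        }
    }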

HDFS is designed around a write-once, read-many access pattern: files are assumed to be written only once but read frequently. Read operations are therefore more efficient than writes.
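One consequence of this write-once design is that HDFS offers no in-place overwrite of file contents: depending on the cluster configuration, an existing file can at most be appended to or replaced. A hedged sketch, reusing the placeholder NameNode address from above:

    import java.net.URI;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsAppendSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode address; adjust to the real cluster.
            FileSystem fs = FileSystem.get(
                    URI.create("hdfs://namenode:9000"), conf);

            // HDFS does not support random writes into an existing file;
            // append() is the only way to add data to one (and may be
            // disabled by cluster configuration).
            Path file = new Path("/user/demo/hello.txt");
            try (FSDataOutputStream out = fs.append(file)) {
                out.write(" More data.".getBytes(StandardCharsets.UTF_8));
            }
        }
    }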

To increase resilience, HDFS replicates the payload data: files are split into blocks of equal size, which the NameNode places on multiple DataNodes. By default, each data block is replicated three times; if a node fails, the corresponding block is retrieved from a replica on another node. To avoid data loss, the replicas of a block are never all stored on a single node.
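A short sketch of how replication surfaces in the Java FileSystem API (the address and path are again placeholder assumptions): the replication factor can be set per file, and the NameNode reports on which hosts each block's replicas reside.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReplicationSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode address; adjust to the real cluster.
            FileSystem fs = FileSystem.get(
                    URI.create("hdfs://namenode:9000"), conf);

            Path file = new Path("/user/demo/hello.txt");

            // Raise the replication factor of an existing file from the
            // default (3) to 4; the NameNode schedules the extra copies.
            fs.setReplication(file, (short) 4);

            // List the hosts holding each block's replicas.
            FileStatus status = fs.getFileStatus(file);
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.println("Block at offset " + block.getOffset()
                        + " replicated on: "
                        + String.join(", ", block.getHosts()));
            }
        }
    }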

Information:
English: Hadoop distributed file system - HDFS
Updated at: 14.10.2016
Links: Hadoop, distributed file system (DFS), fault tolerance (FT), hardware (HW)
Translations: DE
