HDFS edit
A highly resilient distributed cluster file system proven at extreme scale. Consists of a single NameNode service (that's responsible for all metadata management, including the filesystem namespace and block management) plus DataNode services that run on all storage nodes (that manage block IO). Supports NameNode high availability, metadata resilience (via a transaction log), data resilience (via block replication or erasure coding), user authentication, extended ACLs, snapshots, quotas, central caching, a REST API, an NFS gateway, rolling upgrades, rack awareness, transparent encryption, NameNode federation (support for multiple independant NameNodes on the same cluster serving different namespaces) and support for heterogeneous storage. Part of the original Hadoop code base, becoming an Apache Hadoop sub-project in July 2009. Currently being updated to run over the new HDDS (Hadoop Distributed Data Storage) layer, moving block management from the NameNode to a new Storage Container Manager to increase scalability. Technology Information
Other Names Hadoop Distributed File System Type Sub-Project Parent Project Apache Hadoop Last Updated September 2018 HDFS Ecosystem
Links
Blog Posts