Replication factor in HDFS

We have seen that HDFS supports block replication to provide fault tolerance and availability. By default HDFS replicates a block to 3 nodes.

Open the terminal, connect to the main node and list the content of the myNewDir2 directory using:

hadoop fs -ls myNewDir2

The number 3 after the right permissions is actually the replication factor for the file u.data. It means that each block of this file will be replicated 3 times in the cluster.

We can change the replication factor using the dfs.replication property. In the terminal, type in the following command:

hadoop fs -Ddfs.replication=2 -cp myNewDir2/u.data myNewDir/u.data.rep2
hadoop fs -ls myNewDir

results matching ""

    No results matching ""