Copying files to and from HDFS
To copy files from the local filesystem to HDFS, we can use the copyFromLocal command. To copy files from HDFS back to the local filesystem, we can use the copyToLocal command.
hadoop fs -copyFromLocal <source_location_in_local_filesystem> <destination_location_in_HDFS>
hadoop fs -copyToLocal <source_location_in_HDFS> <destination_location_in_local_filesystem>
Let's connect to the node where we previously downloaded the u.data and u.item files to /home/ubuntu, and copy the u.data file from the local filesystem to the new directory myNewDir in HDFS. In the terminal, type the following command (using a relative path):
hadoop fs -copyFromLocal u.data myNewDir
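or type this command (using absolute paths; a relative HDFS path such as myNewDir is resolved against the user's HDFS home directory, here /user/ubuntu):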
hadoop fs -copyFromLocal /home/ubuntu/u.data /user/ubuntu/myNewDir
hadoop fs -ls myNewDir
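Running copyFromLocal a second time against the same destination fails because the file already exists, unless the -f option is given to overwrite it. As a minimal sketch of one way to make the upload safe to repeat, the function below checks for the file first with hadoop fs -test -e. The name upload_if_absent and the message format are assumptions made for illustration; they are not part of Hadoop.

```shell
# Hypothetical helper: copy a local file into an HDFS directory only when
# no file of the same name is already there.
upload_if_absent() {
    local src="$1" dest_dir="$2"
    local target
    target="$dest_dir/$(basename "$src")"

    # "hadoop fs -test -e PATH" exits with status 0 when PATH exists in HDFS.
    if hadoop fs -test -e "$target"; then
        echo "skip: $target already exists in HDFS"
    else
        hadoop fs -copyFromLocal "$src" "$dest_dir"
    fi
}
```

With the function defined in the current shell, upload_if_absent /home/ubuntu/u.data myNewDir can be run repeatedly without raising the "file exists" error.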
The hadoop fs -copyToLocal command works in a similar way:
hadoop fs -copyToLocal myNewDir/u.data u.data.copy
ls
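Listing the directory only shows that u.data.copy exists. To check that its contents match the file stored in HDFS, one option is to stream the HDFS file with hadoop fs -cat and compare it byte by byte with the local copy. This is a rough sketch; hdfs_matches_local is a hypothetical helper name, and the standard cmp utility is assumed to be installed.

```shell
# Hypothetical helper: compare an HDFS file with a local file byte by byte.
hdfs_matches_local() {
    local hdfs_path="$1" local_path="$2"

    # "hadoop fs -cat" writes the HDFS file to stdout; "cmp -s - FILE" reads
    # that stream from stdin and exits 0 only when both inputs are identical.
    # Caveat: if the HDFS read fails, the stream is empty, so an empty local
    # file would still be reported as a match.
    if hadoop fs -cat "$hdfs_path" | cmp -s - "$local_path"; then
        echo "match: $hdfs_path == $local_path"
    else
        echo "MISMATCH: $hdfs_path != $local_path"
        return 1
    fi
}
```

For the copy made above, the call would be hdfs_matches_local myNewDir/u.data u.data.copy.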
We can also use HDFS commands such as hadoop fs -cp or hadoop fs -mv to copy or move files within HDFS.
To copy a file:
hadoop fs -cp <source_location_in_HDFS> <destination_location_in_HDFS>
To move a file:
hadoop fs -mv <source_location_in_HDFS> <destination_location_in_HDFS>
For example, let's create two new directories in HDFS:
hadoop fs -mkdir myNewDir2
hadoop fs -mkdir myNewDir3
Copy the file u.data in myNewDir to myNewDir2 using:
hadoop fs -cp myNewDir/u.data myNewDir2
hadoop fs -ls myNewDir2
Move the file u.data in myNewDir to myNewDir3 using:
hadoop fs -mv myNewDir/u.data myNewDir3
hadoop fs -ls myNewDir
hadoop fs -ls myNewDir3
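The difference between the two commands is that -cp leaves the source file in place, while -mv removes it from the source directory. Instead of reading three separate listings, the check can be scripted with hadoop fs -test -e. The function below is only an illustrative sketch, and report_presence is a hypothetical name.

```shell
# Hypothetical helper: report whether a file name exists in each of the
# given HDFS directories.
report_presence() {
    local name="$1"
    shift
    local dir
    for dir in "$@"; do
        # "hadoop fs -test -e PATH" exits 0 when PATH exists in HDFS.
        if hadoop fs -test -e "$dir/$name"; then
            echo "$dir/$name: present"
        else
            echo "$dir/$name: absent"
        fi
    done
}
```

After the copy and move steps above, report_presence u.data myNewDir myNewDir2 myNewDir3 should report the file as absent from myNewDir and present in both myNewDir2 and myNewDir3, assuming each command succeeded.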