Hadoop copy a directory?


Is there an HDFS API that can copy an entire local directory to HDFS? I found an API for copying files, but is there one for directories?


Use the Hadoop FS shell. Specifically:

$ hadoop fs -copyFromLocal /path/to/local hdfs:///path/to/hdfs

If you want to do it programmatically, create two FileSystem instances (one local, one HDFS) and use the FileUtil class.
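
As a minimal sketch of that approach in Scala (the NameNode URI and both paths are placeholders; deleteSource is set to false so the local copy is kept):

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}

val conf = new Configuration
// One FileSystem for the local disk, one for the cluster (placeholder URI)
val localFs = FileSystem.getLocal(conf)
val hdfs = FileSystem.get(new URI("hdfs://localhost:9000"), conf)

// FileUtil.copy recurses into subdirectories, so a directory source is fine
FileUtil.copy(localFs, new Path("/path/to/local"),
              hdfs, new Path("/path/on/hdfs"),
              false /* deleteSource */, conf)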


I tried copying the directory using

/hadoop/core/bin/hadoop fs -copyFromLocal /home/grad04/lopez/TPCDSkew/ /export/hadoop1/lopez/Join/TPCDSkew

It gave me an error saying "Target is a directory". I then modified the command to

/hadoop/core/bin/hadoop fs -copyFromLocal /home/grad04/lopez/TPCDSkew/*.* /export/hadoop1/lopez/Join/TPCDSkew

and it works.


In Hadoop version:

Hadoop 2.4.0.2.1.1.0-390

(And probably later; I have only tested this specific version, as it is the one I have.)

You can copy entire directories recursively, without any special notation, using copyFromLocal, e.g.:

hadoop fs -copyFromLocal /path/on/disk /path/on/hdfs

which works even when /path/on/disk is a directory containing subdirectories and files.


You can also use the put command, which copies directories as well:

$ hadoop fs -put /local/path hdfs:///path


For programmers: you can also use copyFromLocalFile. Here is a Scala example:

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Connect to the HDFS NameNode
val hdfsConfig = new Configuration
val hdfsURI = "hdfs://127.0.0.1:9000/hdfsData"
val hdfs = FileSystem.get(new URI(hdfsURI), hdfsConfig)

// Copy a local file into HDFS (the local source is not deleted)
val oriPath = new Path("#your_localpath/customer.csv")
val targetFile = new Path("hdfs://your_hdfspath/customer.csv")
hdfs.copyFromLocalFile(oriPath, targetFile)
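
Note that copyFromLocalFile also accepts a directory as the source Path and copies it recursively, so the same call handles the directory case from the question; the paths below are hypothetical:

// Copying a whole directory works the same way
hdfs.copyFromLocalFile(new Path("#your_localpath/TPCDSkew"), new Path("hdfs://your_hdfspath/TPCDSkew"))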
