Adding multiple files to Hadoop distributed cache?

Developer https://www.devze.com 2023-01-14 01:39 Source: web
I am trying to add multiple files to the Hadoop distributed cache. I don't know the exact file names in advance; they will be named like part-0000*. Can someone tell me how to do that?

Thanks, Bala


You can use either the hadoop fs -put or the hadoop fs -copyFromLocal command:

hadoop fs -copyFromLocal /home/hadoop/outgoing/* /your/hadoop/dir


I solved this problem, although it may be a bit late:

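// Assumes a class extending Configured (for getConf()) and a Path named directoryPath.
Configuration conf = getConf();
FileSystem fs = directoryPath.getFileSystem(conf);
// List every entry in the directory and register each one.
FileStatus[] fileStatus = fs.listStatus(directoryPath);
for (FileStatus status : fileStatus) {
    DistributedCache.addFileToClassPath(status.getPath(), conf);
}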

Is this what you wanted to do?


Nothing prevents you from programmatically getting the list of files, if they are all in one directory, and then adding them one by one, right? Or is your case different?
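
For what it's worth, here is a minimal sketch of the "get the list, then pick the ones you want" step, since the directory may also contain entries that are not part files. It uses only the plain JDK so it runs without a cluster. The class name PartFileSelector, the method select, and the sample listing are made up for illustration and are not part of Hadoop.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop: picks the names that match a glob.
final class PartFileSelector {

    // Returns the names matching the glob (e.g. "part-0000*"), sorted.
    static List<String> select(List<String> names, String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> matched = new ArrayList<>();
        for (String name : names) {
            if (matcher.matches(Paths.get(name))) {
                matched.add(name);
            }
        }
        Collections.sort(matched);
        return matched;
    }

    public static void main(String[] args) {
        // Made-up directory listing for illustration.
        List<String> listing = Arrays.asList("part-00001", "_SUCCESS", "part-00000", "_logs");
        for (String name : select(listing, "part-0000*")) {
            System.out.println(name);
        }
    }
}
```

On HDFS itself, FileSystem.globStatus(new Path(directoryPath, "part-0000*")) should return only the matching FileStatus entries, so the loop in the answer above could iterate over that result instead of listStatus.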
