Pass file location as value to Hadoop mapper?

Developer (devze.com) https://www.devze.com 2023-02-13 06:15 Source: Internet
Is it possible to pass the locations of files in HDFS as the value to my mapper, so that I can run an executable on them to process them?


Yes, you can create a file in HDFS that lists the file names, and use it as the input for the map/reduce job. You will need to create a custom splitter in order to serve several file names to each mapper. By default your input file will be split by blocks, and the whole file list will probably be passed to one mapper.
Another solution is to define your input as not splittable. In this case each whole file will be passed to a mapper, and you are free to create your own InputFormat which uses whatever logic you need to process the file, for example calling an external executable. If you go this way, the Hadoop framework will take care of data locality.
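To make the first option concrete, here is a minimal sketch in plain Java of what such a splitter has to do: take the lines of the file list and hand them out in small groups, one group per mapper. The class name `FileListSplitter`, the method `split`, and the example paths are all made up for illustration and are not part of Hadoop. Hadoop itself ships `NLineInputFormat` (with `NLineInputFormat.setNumLinesPerSplit`), which applies the same idea to a text input file, so a custom splitter may not be needed at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: groups a list of HDFS paths
// (one per line of the input file) so that each mapper receives a few of them.
final class FileListSplitter {

    // Returns consecutive groups of at most linesPerSplit paths, in input order.
    static List<List<String>> split(List<String> paths, int linesPerSplit) {
        if (linesPerSplit <= 0) {
            throw new IllegalArgumentException("linesPerSplit must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < paths.size(); start += linesPerSplit) {
            int end = Math.min(start + linesPerSplit, paths.size());
            // Copy the sublist so each group is independent of the input list.
            splits.add(new ArrayList<>(paths.subList(start, end)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Made-up example paths; a real job would read them from the list file.
        List<String> paths = Arrays.asList(
                "/data/in/a.bin", "/data/in/b.bin", "/data/in/c.bin");
        // Print one group per line, i.e. what each mapper would be given.
        for (List<String> group : split(paths, 2)) {
            System.out.println(group);
        }
    }
}
```

Note that with a file list as input, the mapper runs wherever the list's block lives, so this route gives up the data locality that the non-splittable InputFormat route keeps.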


Another way of approaching this is to obtain the file name through FileSplit, which can be done with the following code:

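    // Inside map() or setup(): the split being processed identifies the input file.
    FileSplit fileSplit = (FileSplit) context.getInputSplit();
    // getName() is only the last path component; getPath().toString() is the full location.
    String filename = fileSplit.getPath().getName();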
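
Once the mapper knows the file location, running the executable on it could look roughly like the sketch below. It is plain Java with no Hadoop dependency. The class `ExternalTool`, its methods, the executable path, and the `--verbose` flag are hypothetical names chosen for illustration. It also assumes the executable can open the location it is given; if it can only read local files, the file would first have to be copied out of HDFS.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds and runs an external command for one input file.
final class ExternalTool {

    // Command layout assumed here: executable, optional flags, then the file path.
    static List<String> buildCommand(String executable, String filePath, String... extraArgs) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        command.addAll(Arrays.asList(extraArgs));
        command.add(filePath);
        return command;
    }

    // Starts the process, lets it share this JVM's stdin/stdout/stderr,
    // waits for it to finish and returns its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Made-up values; in a mapper the path would come from
        // fileSplit.getPath().toString() or from the input value.
        List<String> command = buildCommand(
                "/opt/tools/process", "hdfs://namenode/data/in/a.bin", "--verbose");
        // Only prints the command; call run(command) to actually execute it.
        System.out.println(command);
    }
}
```

A mapper would typically check the exit code returned by `run` and fail the task, or count the error, when it is non-zero.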

Hope this helps
