Whenever I try to use Java class files as my mapper and/or reducer, I get the following error:
java.io.IOException: Cannot run program "MapperTst.class": java.io.IOException: error=2, No such file or directory
I executed the following command on the terminal:
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar -file /home/hadoop/codes/MapperTst.class -mapper /home/hadoop/codes/MapperTst.class -file /home/hadoop/codes/ReducerTst.class -reducer /home/hadoop/codes/ReducerTst.class -input gutenberg/* -output gutenberg-outputtstch27
Assuming your fully qualified mapper class name (including the package) is codes.MapperTst and your reducer class name is codes.ReducerTst,
package your map and reduce classes into a jar file, say /home/hadoop/test.jar. Your command should work if you modify it to:
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar \
    contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -libjars /home/hadoop/test.jar \
    -mapper codes.MapperTst \
    -reducer codes.ReducerTst \
    -input gutenberg/* -output gutenberg-outputtstch27
I had the same problem. The solution for me was to put the Java mapper/reducer/combiner classes in a named package. With the default package it won't work; it gives the error you had.
Streaming is not supposed to work with Java classes. It is supposed to run anything that can be treated as a Linux command. Input data is fed to the command's input stream, and whatever the command writes to its output stream is treated as the mapper output. If you already have a mapper class in Java, you do not need streaming.
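To illustrate the "anything that can be treated as a command" point, here is a minimal sketch of a mapper written against that contract instead of the Hadoop Mapper API: it reads lines from standard input and writes tab-separated key/value records to standard output. The class name StreamingWordMapper and the word-count logic are hypothetical and not taken from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: a streaming-style mapper that uses only
// stdin and stdout, with no dependency on the Hadoop libraries.
final class StreamingWordMapper {

    // Turns one input line into "word<TAB>1" records,
    // one record per whitespace-separated token.
    static List<String> mapLine(String line) {
        List<String> records = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                records.add(token + "\t1");
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        String line;
        // Read input lines until end of stream and print each record.
        while ((line = in.readLine()) != null) {
            for (String record : mapLine(line)) {
                System.out.println(record);
            }
        }
    }
}
```

A program like this would be launched through a command (for example `java StreamingWordMapper`, with the compiled class shipped to the task via -file) rather than by naming the .class file itself as the executable. That invocation is an assumption and has not been tested against this Hadoop version.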