
Hadoop read from standard input stream

Developer https://www.devze.com 2023-03-23 09:01 Source: Internet

I want my MapReduce program to read from the standard input stream (System.in). For example, in the run() method, how can I make my program read from System.in instead of from a file, as in FileInputFormat.addInputPath(job, new Path("dummy.txt"));?

Also, which class should I pass to job.setInputFormat(...)?


Use Hadoop Streaming to do this:

http://wiki.apache.org/hadoop/HadoopStreaming

It lets the mapper and reducer be any executable that reads its input from stdin and writes its output to stdout.
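To illustrate what a streaming mapper looks like, here is a minimal sketch in plain Java. It assumes a word-count style job and the usual streaming convention of one tab-separated key/value pair per output line; the class name StdinWordMapper and the map helper are hypothetical, not part of Hadoop. Note that the framework still takes the job input from files and pipes it into the mapper's stdin, so this does not make the driver's own System.in the job input.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical streaming mapper: the framework pipes input records to this
// process on stdin and collects "key<TAB>value" lines from its stdout.
class StdinWordMapper {

    // Emits "<word>\t1" for every whitespace-separated token read from `in`.
    static void map(Reader in, PrintStream out) {
        BufferedReader reader = new BufferedReader(in);
        reader.lines().forEach(line -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {          // skip blank lines
                    out.print(word + "\t1\n");  // explicit \n, not the platform separator
                }
            }
        });
        out.flush();
    }

    public static void main(String[] args) {
        map(new InputStreamReader(System.in, StandardCharsets.UTF_8), System.out);
    }
}
```

A command that launches this class would be passed to the streaming jar through its -mapper option, alongside -input and -output; the exact jar location depends on the installation.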


I have not seen such an InputFormat in Hadoop. You will probably have to write the contents of System.in to a file from time to time and run the Hadoop job over the saved content every time you get a new batch.

This situation is common when using Hadoop to process log files that are generated or appended to continuously. In that case it is wise to collect the log file(s) on a daily or weekly basis and run the Hadoop job over them once you have them.
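The save-then-run approach from this answer could be sketched roughly as follows. This is only an illustration using the Java standard library: the class name StdinSpooler and the temp-file prefix are made up, and it assumes the saved file is afterwards uploaded to the cluster file system and registered with FileInputFormat.addInputPath as in the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: saves one batch of stdin to a local file so that a
// file-based job can be run over it afterwards.
class StdinSpooler {

    // Copies everything readable from `in` into a new temp file and returns its path.
    // Note: this is java.nio.file.Path, not org.apache.hadoop.fs.Path.
    static Path spool(InputStream in) {
        try {
            Path target = Files.createTempFile("stdin-batch-", ".txt");
            // createTempFile already created the file, so allow overwriting it.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path saved = spool(System.in);
        // Assumption: the driver would next upload this file to the cluster and
        // add it as job input, e.g. FileInputFormat.addInputPath(job, ...).
        System.out.println(saved.toAbsolutePath());
    }
}
```

Because spool returns only after stdin reaches end-of-stream, a continuously growing source would need to be cut into batches upstream, which matches the daily or weekly schedule suggested above.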
