We have a custom input format extending FileInputFormat that generates a separate split for each line of the input file. Each line provides the host name on which the mapper handling that line should run.
How do I achieve this?
This is needed because the mapper reads data from a DB, and I want to run the mapper on the same machine as the DB server.
Forcing a mapper onto a specific host is not possible without writing your own implementation within the Hadoop code base.
If you are only trying to add more data to the map input, pass it in as an argument to the job; you can then read it in your map() and concatenate it with the input.
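On the placement question, the closest thing I know of to a supported mechanism is the locality hint a split returns from `getLocations()`. Hadoop's scheduler tries to honor it but does not guarantee it. Here is a minimal sketch of a split that carries the host taken from each line. The class name `HostLineSplit`, the `host<TAB>payload` line layout, and the `roundTrip` helper are assumptions for illustration. The class is written without Hadoop imports so that it compiles on its own; in a real job it would extend `org.apache.hadoop.mapreduce.InputSplit` and implement `org.apache.hadoop.io.Writable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical split holding one input line plus the host it prefers.
// Method names mirror InputSplit and Writable, but nothing here depends on Hadoop.
final class HostLineSplit {
    private String host = "";
    private String line = "";

    // No-arg constructor: the framework creates an empty split, then calls readFields().
    HostLineSplit() {}

    HostLineSplit(String host, String line) {
        this.host = host;
        this.line = line;
    }

    // Assumed line layout: "<host><TAB><rest of line>".
    static HostLineSplit parse(String rawLine) {
        int tab = rawLine.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("expected host<TAB>payload: " + rawLine);
        }
        return new HostLineSplit(rawLine.substring(0, tab), rawLine.substring(tab + 1));
    }

    // Mirrors InputSplit.getLocations(): a scheduling hint, not a guarantee.
    String[] getLocations() {
        return new String[] { host };
    }

    // Mirrors InputSplit.getLength(): size of the payload in bytes.
    long getLength() {
        return line.getBytes(StandardCharsets.UTF_8).length;
    }

    String getLine() {
        return line;
    }

    // Mirrors Writable.write(): serialize the split so it can be shipped to the task.
    void write(DataOutput out) throws IOException {
        out.writeUTF(host);
        out.writeUTF(line);
    }

    // Mirrors Writable.readFields(): restore the split on the task side.
    void readFields(DataInput in) throws IOException {
        host = in.readUTF();
        line = in.readUTF();
    }

    // Illustration-only helper: serialize and deserialize in memory.
    static HostLineSplit roundTrip(HostLineSplit split) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            split.write(new DataOutputStream(bytes));
            HostLineSplit copy = new HostLineSplit();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Your input format's getSplits() would then return one such split per line. Because the host is only a hint, the mapper can still be scheduled on another node when the preferred one is busy, so the mapper should connect to the DB by host name instead of assuming it is local.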