MapReduceBase and Mapper deprecated

Developer https://www.devze.com 2023-04-10 16:31 Source: Internet
public static class Map extends MapReduceBase implements Mapper

MapReduceBase, Mapper and JobConf are deprecated in Hadoop 0.20.203.

What should we use now?

Edit 1 - For Mapper and MapReduceBase, I found that we just need to extend the new Mapper class (org.apache.hadoop.mapreduce.Mapper):

public static class Map extends Mapper
            <LongWritable, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  // New API: a single Context replaces OutputCollector and Reporter.
  @Override
  public void map(LongWritable key, Text value, Context context)
         throws IOException, InterruptedException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
      word.set(tokenizer.nextToken());
      context.write(word, one);  // emit (word, 1)
    }
  }
}

Edit 2 - Instead of JobConf, we should use Configuration together with Job, like this:

public static void main(String[] args) throws Exception {
  Configuration conf = new Configuration();
  Job job = new Job(conf);
  job.setMapperClass(WordCount.Map.class);
}
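To put Edit 1 and Edit 2 together, here is a minimal sketch of a complete word count written against the new org.apache.hadoop.mapreduce API. The Reduce class, the job name "wordcount", and the use of args[0] and args[1] as input and output paths are assumptions added for illustration; they are not part of the original question. Later Hadoop releases prefer Job.getInstance(conf) over the Job constructor used here.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper from Edit 1: emits (word, 1) for every token in the line.
  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer tokenizer = new StringTokenizer(value.toString());
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        context.write(word, one);
      }
    }
  }

  // Assumed reducer (not in the original post): sums the counts per word.
  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable value : values) {
        sum += value.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // replaces JobConf
    Job job = new Job(conf, "wordcount");       // job name is an assumption
    job.setJarByClass(WordCount.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Assumed convention: args[0] is the input path, args[1] the output path.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that every import comes from org.apache.hadoop.mapreduce rather than org.apache.hadoop.mapred; mixing the two packages is a common source of compile errors when migrating.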

Edit 3 - I found a good tutorial on the new API: http://sonerbalkir.blogspot.com/2010/01/new-hadoop-api-020x.html


The Javadoc says what to use instead of these deprecated classes:

e.g. http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/JobConf.html

 Deprecated. Use Configuration instead

Edit: When you use Maven and open a class declaration (F3), Maven can automatically download the source code, and you'll see the Javadoc comments with their explanations.


Functionality-wise there is not much difference between the old and the new API, except that the old API only supports pushing records to the map/reduce functions, while the new API supports both push and pull. The new API is, however, much cleaner and easier to evolve.
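In the new API the pull side is exposed through Mapper.run(Context), which by default loops over context.nextKeyValue() and calls map() for each record; overriding run() lets the mapper decide when to fetch the next record. The sketch below models that difference in plain Java so it runs without Hadoop on the classpath. RecordContext, PushMapper and PairingMapper are hypothetical stand-ins, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PushPullSketch {

  /** Hypothetical stand-in for Hadoop's Context: a record source plus an output sink. */
  static class RecordContext {
    private final Iterator<String> records;
    private String current;
    final List<String> output = new ArrayList<>();

    RecordContext(List<String> input) {
      this.records = input.iterator();
    }

    /** Advances to the next record; returns false when the input is exhausted. */
    boolean nextRecord() {
      if (!records.hasNext()) {
        return false;
      }
      current = records.next();
      return true;
    }

    String currentRecord() {
      return current;
    }

    void write(String value) {
      output.add(value);
    }
  }

  /** Push style: the default run() loop hands one record at a time to map(). */
  static class PushMapper {
    void map(String record, RecordContext context) {
      context.write(record.toUpperCase());
    }

    void run(RecordContext context) {
      while (context.nextRecord()) {
        map(context.currentRecord(), context);
      }
    }
  }

  /** Pull style: run() is overridden so the mapper fetches two records per step. */
  static class PairingMapper extends PushMapper {
    @Override
    void run(RecordContext context) {
      while (context.nextRecord()) {
        String first = context.currentRecord();
        String second = context.nextRecord() ? context.currentRecord() : "";
        context.write(first + "+" + second);
      }
    }
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("a", "b", "c");

    RecordContext pushed = new RecordContext(input);
    new PushMapper().run(pushed);
    System.out.println(pushed.output);  // prints [A, B, C]

    RecordContext pulled = new RecordContext(input);
    new PairingMapper().run(pulled);
    System.out.println(pulled.output);  // prints [a+b, c+]
  }
}
```

With the old API only the first pattern is available, because the framework owns the loop and calls map() once per record.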

Here is the JIRA for the introduction of the new API. Also, the old API has been un-deprecated in 0.21 and will be deprecated in release 0.22 or 0.23.

You can find more information about the new API, sometimes called the 'context objects' API, here and here.
