开发者

How to call Partitioner in Haoop v 0.21

开发者 https://www.devze.com 2023-03-06 18:58 出处:网络
In my application I want to create as many reducer jobs as possible based on the keys. Now my current implementation writes all the keys and values in a single (reducer) output file. So to solve this,

In my application I want to create as many reducer jobs as possible based on the keys. Now my current implementation writes all the keys and values in a single (reducer) output file. So to solve this, I have used one partitioner but I cannot call the class.The partitioner should be called after the selection Map task and before the selection reduce task but it did not.The code of the partitioner is the following

public class MultiWayJoinPartitioner extends Partitioner<Text, Text> {
@Override
public int getPartition(Text key, Text value, int nbPartitions) {
return (key.getFirst().hashCode() & Integer.MAX_VALUE) % nbPartitions;
return 0;
}
}

Is this code is correct to partition the files based on the keys and values and the output will b开发者_开发问答e transfer to the reducer automatically??


You don't show all of your code, but there is usually a class (called the "Job" or "MR" class) that configures the mapper, reducer, partitioner, etc. and then actually submits the job to hadoop. In this class you will have a job configuration object that has many properties, one of which is the number of reducers. Set this property to whatever number your hadoop configuration can handle.

Once the job is configured with a given number of reducers, that number will be passed into your partition (which looks correct, by the way). Your partitioner will start returning the appropriate reducer/partition for the key/value pair. That's how you get as many reducers as possible.

0

精彩评论

暂无评论...
验证码 换一张
取 消