Get the input path in a Hadoop Mapper Class

Developer https://www.devze.com 2023-02-15 02:39 Source: Internet

I have implemented a simple MapReduce project in Hadoop for processing logs. The input path is the directory where the logs are.

It works fine, but inside the class that implements the Mapper I would like to know the path of the log file being processed at any given time. The Mapper code is:

public class StatsMapper extends MapReduceBase implements Mapper<WritableComparable<Text>,Text,Text,Text> { 

    public static final Log LOG = LogFactory.getLog(StatsMapper.class);

    public void configure(JobConf conf) {}

    public void map(WritableComparable<Text> key, Text value, OutputCollector<Text,Text> output, Reporter reporter)
            throws IOException {

        process(key,value);

    }

}

Any idea?

Thanks in advance


Read the InputFormat section here

How these input files are split up and read is defined by the InputFormat. An InputFormat is a class that provides the following functionality:

- Selects the files or other objects that should be used for input
- Defines the InputSplits that break a file into tasks
- Provides a factory for RecordReader objects that read the file
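
Since each map task is handed one InputSplit, the path of the file being read can usually be recovered from that split. The following is a minimal sketch for the old `org.apache.hadoop.mapred` API used in the question, not a definitive implementation. It assumes a file-based input format such as `TextInputFormat` (hence the `LongWritable` key, which differs from the key type in the question), and the class name `InputPathMapper` is hypothetical. It needs the Hadoop libraries on the classpath. The `map.input.file` property is the name used by older releases; newer releases treat it as a deprecated alias, so check it against your version.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper that tags every record with the file it came from.
public class InputPathMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    // Path of the file behind the current split; may stay null for
    // input formats that are not file based.
    private String inputFile;

    @Override
    public void configure(JobConf conf) {
        // Option 1: the framework sets this property per map task
        // for file-based splits (assumption: old property name).
        inputFile = conf.get("map.input.file");
    }

    @Override
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        // Option 2: ask the Reporter for the split and, if it is a
        // FileSplit, read the path directly from it.
        InputSplit split = reporter.getInputSplit();
        if (split instanceof FileSplit) {
            inputFile = ((FileSplit) split).getPath().toString();
        }

        // Fall back to a placeholder when no path is available.
        String source = (inputFile != null) ? inputFile : "unknown";
        output.collect(new Text(source), value);
    }
}
```

Either option alone is normally enough. Reading the property in `configure` runs once per task, while the `Reporter` call in `map` runs per record, so caching the result in a field is the cheaper choice for large inputs.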
