I want to create a directory inside the working directory of a MapReduce job in Hadoop.
For example, by using:

File setupFolder = new File(setupFolderName);
setupFolder.mkdirs();

in my mapper class to write some intermediate files into it. Is this the right way to do it?
Also after completion of the job how will I access this directory again if I wish so?
Please advise.
If you are using Java, you can override the setup method and open the file handle there (and close it in cleanup). The handle will then be available for the lifetime of each mapper.
I am assuming that you are not writing all of the map output here, just some debug/stats. With this handle you can read and write as shown in this example: http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample
If you want to read the whole directory afterwards, check out this example: https://sites.google.com/site/hadoopandhive/home/how-to-read-all-files-in-a-directory-in-hdfs-using-hadoop-filesystem-api
Remember that you will not be able to depend on the order of the data written to the files.
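A minimal sketch of that setup/cleanup pattern, writing a side file to HDFS per mapper. Note that the class name, the /tmp/job-stats path, and the debug message are made up for illustration; only the Mapper lifecycle methods and the FileSystem calls are the real API:

```java
import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class StatsMapper extends Mapper<LongWritable, Text, Text, Text> {

    private FSDataOutputStream side;  // side-file handle, one per mapper task

    @Override
    protected void setup(Context context) throws IOException {
        FileSystem fs = FileSystem.get(context.getConfiguration());
        // One file per task attempt so concurrent mappers do not collide.
        Path p = new Path("/tmp/job-stats/" + context.getTaskAttemptID());
        side = fs.create(p);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // ... normal map logic emits through context.write(...) ...
        side.writeBytes("debug: saw input offset " + key.get() + "\n");
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        side.close();  // flush and release the handle when the task finishes
    }
}
```

Naming the file after the task attempt ID avoids two mappers (or a speculative re-execution of the same mapper) clobbering each other's output.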
You can override setup() in your reducer class, use mkdirs() to create the folder and create() to open a file for an output stream.
@Override
protected void setup(Context context) throws IOException {
    Configuration conf = context.getConfiguration();
    FileSystem fs = FileSystem.get(conf);
    // Create the directory on HDFS before the reducer starts processing.
    fs.mkdirs(new Path("your_path_here"));
}
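As for the second part of the question (accessing the directory again after the job completes): the same FileSystem API works from any client, such as the driver program. A hedged sketch, assuming a /tmp/job-stats directory like the one created above:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListSideFiles {
    public static void main(String[] args) throws IOException {
        // Picks up fs.defaultFS from the cluster configuration on the classpath.
        FileSystem fs = FileSystem.get(new Configuration());
        // List everything the tasks wrote under the shared directory.
        for (FileStatus status : fs.listStatus(new Path("/tmp/job-stats"))) {
            System.out.println(status.getPath() + " (" + status.getLen() + " bytes)");
        }
    }
}
```

From there you can open each path with fs.open() to read the contents back.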