Exception while executing a Hadoop job remotely

devze.com (https://www.devze.com), 2023-03-01 01:24, source: Internet
I am trying to execute a Hadoop job on a remote Hadoop cluster. Below is my code.

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://server:9000/");
conf.set("hadoop.job.ugi", "username");

Job job = new Job(conf, "Percentil Ranking");
job.setJarByClass(PercentileDriver.class);
job.setMapperClass(PercentileMapper.class);
job.setReducerClass(PercentileReducer.class);
job.setMapOutputKeyClass(TestKey.class);
job.setMapOutputValueClass(TestData.class);
job.setOutputKeyClass(TestKey.class);
job.setOutputValueClass(BaselineData.class);

job.setOutputFormatClass(SequenceFileOutputFormat.class);

FileInputFormat.addInputPath(job, new Path(inputPath));

FileOutputFormat.setOutputPath(job, new Path(outputPath));

job.waitForCompletion(true);

As soon as the job starts, an exception is thrown, before the map phase even begins.

java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:617)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1216)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1197)
at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:92)
at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:373)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:800)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)

The input file exists and is a comma-separated text file. I can run the job on the Hadoop cluster with the hadoop jar command using the same input and output, but I can't run it remotely. I am also able to run other jobs remotely.

Can anyone tell me what is the solution to this problem?


It seems that adding conf.set("mapred.job.tracker", "server:9001"); fixed the issue. Thanks for your help.
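
A note on why this setting likely matters: the stack trace in the question passes through LocalJobRunner, which suggests the client was never pointed at a JobTracker and tried to run the job in-process. In Hadoop releases of that era, mapred.job.tracker defaults to "local". The sketch below only imitates that decision with plain java.util.Properties. RunnerCheck and runnerFor are hypothetical names chosen for illustration, not part of Hadoop.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hadoop: it only imitates how a client
// might choose between in-process and cluster execution from one property.
class RunnerCheck {
    // Assumption: an unset mapred.job.tracker behaves like "local".
    static String runnerFor(Properties conf) {
        String tracker = conf.getProperty("mapred.job.tracker", "local");
        return "local".equals(tracker)
                ? "LocalJobRunner (in-process)"
                : "JobTracker at " + tracker;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Only the filesystem is configured, as in the question.
        conf.setProperty("fs.default.name", "hdfs://server:9000/");
        System.out.println(runnerFor(conf));

        // With the tracker address added, the job would be sent to the cluster.
        conf.setProperty("mapred.job.tracker", "server:9001");
        System.out.println(runnerFor(conf));
    }
}
```

Under these assumptions, the first call reports the local runner and the second reports the tracker at server:9001, which matches the symptom and the fix described above.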


You do this:

conf.set("fs.default.name", "serverurl");

So you are setting the filesystem to the value "serverurl"... which is meaningless.

I'm pretty sure it will work if you simply remove that line from your code.

HTH
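
To make the difference between the two values concrete, here is a minimal sketch that parses both with java.net.URI. It makes no claim about how Hadoop itself interprets fs.default.name. FsNameCheck and describe are hypothetical names used only for this illustration.

```java
import java.net.URI;

// Hypothetical helper for illustration; it uses only java.net.URI and
// says nothing about Hadoop's own handling of the property.
class FsNameCheck {
    // Reports the scheme, host, and port that java.net.URI extracts.
    static String describe(String value) {
        URI uri = URI.create(value);
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " port=" + uri.getPort();
    }

    public static void main(String[] args) {
        System.out.println(describe("serverurl"));
        // prints: scheme=null host=null port=-1
        System.out.println(describe("hdfs://server:9000/"));
        // prints: scheme=hdfs host=server port=9000
    }
}
```

A bare string such as "serverurl" carries no scheme, host, or port, whereas the value shown in the question, hdfs://server:9000/, has all three.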
