I'm trying to set up Hadoop and Nutch to run on EC2. To get started, I followed the excellent NutchHadoopTutorial. Almost everything works as it should, except that I am unable to access any of the web interfaces (e.g. the JobTracker). The JobTracker starts without errors, and I can hit nutch-master:50030, but I get what looks like Jetty's default servlet. It returns a link to the webapps directory, from there a link to a job directory, and then a link to nutch-master:50030/webapps/job/jobtracker.jsp, which in turn returns a 404 for RequestURI=/webapps/job/jobtracker.jsp. I've checked the classpath, and everything that is supposed to be there is in fact available:
/usr/lib/jvm/java-6-openjdk/bin/java -Xmx1000m -Dhadoop.log.dir=/nutch/search/logs -Dhadoop.log.file=hadoop-nutch-jobtracker-nutch-master.log -Dhadoop.home.dir=/nutch/search -Dhadoop.id.str=nutch -Dhadoop.root.logger=INFO,DRFA -Djava.library.path=/nutch/search/lib/native/Linux-i386-32 -Dhadoop.policy.file=hadoop-policy.xml -classpath /nutch/search/bin/../conf:/usr/lib/jvm/java-6-openjdk/lib/tools.jar:/nutch/search/hadoop-0.20.2-core.jar:/nutch/search/lib/apache-solr-core-1.4.0.jar:/nutch/search/lib/apache-solr-solrj-1.4.0.jar:/nutch/search/lib/commons-beanutils-1.8.0.jar:/nutch/search/lib/commons-cli-1.2.jar:/nutch/search/lib/commons-codec-1.3.jar:/nutch/search/lib/commons-collections-3.2.1.jar:/nutch/search/lib/commons-el-1.0.jar:/nutch/search/lib/commons-httpclient-3.1.jar:/nutch/search/lib/commons-io-1.4.jar:/nutch/search/lib/commons-lang-2.1.jar:/nutch/search/lib/commons-logging-1.0.4.jar:/nutch/search/lib/commons-logging-api-1.0.4.jar:/nutch/search/lib/commons-net-1.4.1.jar:/nutch/search/lib/core-3.1.1.jar:/nutch/search/lib/geronimo-stax-api_1.0_spec-1.0.1.jar:/nutch/search/lib/hadoop-0.20.2-core.jar:/nutch/search/lib/hadoop-0.20.2-tools.jar:/nutch/search/lib/hsqldb-1.8.0.10.jar:/nutch/search/lib/icu4j-4_0_1.jar:/nutch/search/lib/jakarta-oro-2.0.8.jar:/nutch/search/lib/jasper-compiler-5.5.12.jar:/nutch/search/lib/jasper-runtime-5.5.12.jar:/nutch/search/lib/jcl-over-slf4j-1.5.5.jar:/nutch/search/lib/jets3t-0.6.1.jar:/nutch/search/lib/jetty-6.1.14.jar:/nutch/search/lib/jetty-util-6.1.14.jar:/nutch/search/lib/junit-3.8.1.jar:/nutch/search/lib/kfs-0.2.2.jar:/nutch/search/lib/log4j-1.2.15.jar:/nutch/search/lib/lucene-core-3.0.1.jar:/nutch/search/lib/lucene-misc-3.0.1.jar:/nutch/search/lib/oro-2.0.8.jar:/nutch/search/lib/resolver.jar:/nutch/search/lib/serializer.jar:/nutch/search/lib/servlet-api-2.5-6.1.14.jar:/nutch/search/lib/slf4j-api-1.5.5.jar:/nutch/search/lib/slf4j-log4j12-1.4.3.jar:/nutch/search/lib/taglibs-i18n.jar:/nutch/search/lib/tika-core-0.7.jar:/nutch/search/lib/wstx-asl-3.2.7.jar:/nutch/search/lib/xercesImpl.jar:/nutch/search/lib/xml-apis.jar:/nutch/search/lib/xmlenc-0.52.jar:/nutch/search/lib/jsp-2.1/jsp-2.1.jar:/nutch/search/lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.mapred.JobTracker
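One way to double-check the classpath beyond reading it by eye is to scan each entry for the JSP itself. The following is only a rough sketch. It assumes, which this post does not confirm, that Hadoop 0.20.x serves its web UI from a classpath resource named webapps, so that the first entry containing it is the one that matters. The function name find_resource and the default resource path are made up for illustration.

```python
import os
import zipfile


def find_resource(classpath, resource="webapps/job/jobtracker.jsp"):
    """Return the classpath entries, in order, that contain `resource`.

    Hypothetical helper: directories are checked on disk, jar files are
    opened as zip archives. Entries that do not exist are skipped.
    """
    hits = []
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if os.path.isdir(entry):
            # Directory entry: look for the resource as a regular file path.
            if os.path.exists(os.path.join(entry, *resource.split("/"))):
                hits.append(entry)
        elif zipfile.is_zipfile(entry):
            # Jar entry: jars are zip archives, so inspect the member names.
            with zipfile.ZipFile(entry) as jar:
                if resource in jar.namelist():
                    hits.append(entry)
    return hits
```

Calling find_resource with the string passed to -classpath above would list every directory or jar that ships jobtracker.jsp. An empty result, or a hit in an unexpected entry, would be a hint about where the 404 comes from, though it does not prove the cause.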
I've been googling and trying different things for about 8 hours now, and I'm just absolutely stuck as to what might be wrong. I'm sure it's something painfully obvious that I'm overlooking. Does anyone have any idea?
A few more details: this is a three-node cluster on EC2. I can ssh without a password between the nodes, and they seem to be communicating without issue (i.e. no exceptions in the logs). All nodes run Ubuntu 10.04 Server and Hadoop 0.20.2.
Thanks in advance.