Bash Script Command Issue

Developer (开发者) https://www.devze.com 2023-01-28 00:40 Source: web
When I type the following command into Cygwin:

bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/* 

then the binary works fine. When I place the exact same line into my bash script:

#!/bin/bash/
bin/nutch index crawl/crawldb crawl/linkdb crawl/segment/*

I get an error saying some files don't exist. This may be specific to Nutch, which is the program I'm running, but I think it has more to do with how I'm calling the command in the script. Any ideas about what's wrong and how to fix this? (Yes, I'm using tab completion.)

EDIT:

Script:

#!/bin/bash
/home/Dan/apache-nutch-1.2/bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*

I run the command:

$ pwd
/home/Dan/apache-nutch-1.2
$ ./nutch.sh

The output I'm getting is:

Indexer: starting at 2010-11-29 15:15:44
Indexer: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_fetch
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/crawl_parse
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_data
Input path does not exist: file:/C:/cygwin/home/Dan/apache-nutch-1.2/
/parse_text
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
    at org.apache.nutch.indexer.Indexer.index(Indexer.java:76)
    at org.apache.nutch.indexer.Indexer.run(Indexer.java:97)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.indexer.Indexer.main(Indexer.java:106)

Regards, ~DS


Two things:

  1. You've got a trailing slash after "bash" in the shebang at the start of the script. Remove it; it should just read #!/bin/bash. Also double-check that there is a bash in /bin.
  2. The script will try to execute nutch from the bin directory in your current folder. So if you're in $HOME, and assuming you've got a path $HOME/bin/nutch, then you'll be okay. But if you change to /tmp, it'll fail, as there's no such path as /tmp/bin/nutch. You're better off giving the full absolute path to nutch in the first place.
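
For what it's worth, here is roughly how I'd put those two points together. Treat it as a sketch rather than a tested fix for your setup: the function name run_index and the variable nutch_dir are made up for illustration, and the default path is simply the install location shown in your question. It uses a clean shebang, switches to the install directory so the relative crawl/ paths resolve from wherever you start the script, checks that the glob actually matched something, and calls nutch by its absolute path.

```shell
#!/bin/bash
# Sketch only. run_index and nutch_dir are hypothetical names; the default
# path is the install location shown in the question.

# The body runs in a subshell ( ... ) so the cd does not leak to the caller.
run_index() (
    local nutch_dir="${1:-/home/Dan/apache-nutch-1.2}"
    local segments

    # Work from the install directory so the relative crawl/ paths resolve
    # no matter where the script was started from.
    cd "$nutch_dir" || { echo "cannot cd to $nutch_dir" >&2; exit 1; }

    # Expand the glob ourselves so an empty match is reported clearly
    # instead of being passed to nutch as a literal '*'.
    segments=(crawl/segments/*)
    if [ ! -e "${segments[0]}" ]; then
        echo "no segments found under $nutch_dir/crawl/segments" >&2
        exit 1
    fi

    # Call nutch by absolute path rather than relying on the current folder.
    "$nutch_dir/bin/nutch" index crawl/indexes crawl/crawldb crawl/linkdb "${segments[@]}"
)

# Example call (left commented out so sourcing this file has no side effects):
# run_index /home/Dan/apache-nutch-1.2
```

If you'd rather keep a two-line script, the same idea is just the corrected shebang followed by the command with the full path to bin/nutch, run from the directory that contains crawl/.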