I need to run some Pig scripts sequentially in Hadoop. They must be run separately. Any suggestions?
Update:
Just a quick update that we're working toward running the Pig scripts from one Java class. Oozie is a possibility that was mentioned in a comment (though much too heavy for our needs). I've also heard that it's possible to orchestrate Pig scripts as part of a larger job flow in Cascading (http://www.cascading.org/) and looked at that a little.
For a simple sequence of tasks, what orangeoctopus suggested would probably suffice. If you want to combine Pig scripts and/or plain vanilla MapReduce jobs into a more complex workflow, you should probably take a look at Oozie.
Update:
If you are using Pig 0.9, you could also take a look at embedding Pig in a language like Python. Here's the link
In practice, I wrap the majority of my Pig scripts in bash scripts. You could control the sequential execution inside of the shell script:
pig myscript1.pig && pig myscript2.pig && pig myscript3.pig
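If the list of scripts grows, the one-liner above can be turned into a small wrapper function. The following is only a sketch: `run_pig_sequence` and the `PIG_CMD` variable are names made up for illustration, and it assumes (as the `&&` chain already does) that `pig` exits with a non-zero status when a script fails.

```shell
#!/usr/bin/env bash
# Sketch: run Pig scripts one at a time and stop at the first failure.
# PIG_CMD is a hypothetical override so the loop can be tried without a cluster;
# it defaults to the plain `pig` command.
PIG_CMD="${PIG_CMD:-pig}"

run_pig_sequence() {
    local script
    for script in "$@"; do
        echo "Running ${script}"
        # A non-zero exit status from the command aborts the remaining scripts.
        if ! "$PIG_CMD" "$script"; then
            echo "Failed: ${script}" >&2
            return 1
        fi
    done
    echo "All scripts finished"
}

# Usage (script names taken from the one-liner above):
#   run_pig_sequence myscript1.pig myscript2.pig myscript3.pig
```

Compared with the `&&` chain, this behaves the same way on failure but also reports which script stopped the sequence.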