
What are some approaches to run multiple Pig scripts sequentially?

Developer https://www.devze.com 2023-03-22 18:28 Source: Internet

I need to run some Pig scripts sequentially in Hadoop. They must be run separately. Any suggestions?

Update:

Just a quick update: we're working toward running the Pig scripts from a single Java class. Oozie was suggested in a comment, though it is much too heavy for our needs. I've also heard that it's possible to orchestrate Pig scripts as part of a larger job flow in Cascading (http://www.cascading.org/), and I have looked at that a little.


For a simple sequence of tasks, what orangeoctopus suggested would probably suffice. If you want to combine Pig and/or plain MapReduce jobs into a more complex workflow, you should probably take a look at Oozie.

Update:

If you are using Pig 0.9, you could also look at embedding Pig in a language like Python. Here's the link


In practice, I wrap the majority of my Pig scripts in bash scripts. You can control sequential execution inside the shell script; with &&, each script runs only if the previous one exited successfully:

pig myscript1.pig && pig myscript2.pig && pig myscript3.pig
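
For a longer list of scripts, the same chaining can be written as a loop. This is only a minimal sketch: the function name run_pig_sequence and the PIG_BIN variable are made up for illustration, and it assumes pig returns a non-zero exit status when a script fails.

```shell
#!/usr/bin/env bash
# Run Pig scripts one after another, stopping at the first failure.
# PIG_BIN is a hypothetical override for this sketch; it defaults to `pig`.
PIG_BIN="${PIG_BIN:-pig}"

run_pig_sequence() {
  local script
  for script in "$@"; do
    echo "Running ${script}"
    # A non-zero exit status aborts the remaining scripts, like && does.
    if ! "$PIG_BIN" "$script"; then
      echo "Failed: ${script}" >&2
      return 1
    fi
  done
}

# Example invocation, using the script names from the command above:
# run_pig_sequence myscript1.pig myscript2.pig myscript3.pig
```

Each script still runs as a separate pig invocation, so the jobs stay independent. The loop only adds a log line per script and a message naming the one that failed.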
