I am trying to do this in Python:
cat foo | ssh me@xxxx hadoop fs -put - bar/foo
I originally tried check_call:
foo = 'foo'
subprocess.check_call(['cat', foo, '|','ssh',os.environ['USER']+'@'+hadoopGateway,'hadoop','fs','-put', '-', inputArgs.targetDir+'/'+foo])
which produces the error:
cat: invalid option -- 'p'
I have looked at the Python pipes module documentation and experimented with it in the interactive shell, but I do not understand how to run the template without an output file, which the documentation's example uses.
>>> t = pipes.Template()
>>> t.prepend('cat foo', '.-')
>>> t.append('hadoop fs -put - bar/foo', '-.') # what next
Clearly I am missing something.
You don't need cat or a pipeline for this; all you need is to provide the file as standard input to the ssh command. In shell, that would be
ssh ${USER}@${hadoopGateway} hadoop fs -put - ${targetDir}/foo < foo
and with the Python subprocess module it's only a tiny bit more involved:
foo = 'foo'
subprocess.check_call(['ssh',
                       os.environ['USER'] + '@' + hadoopGateway,
                       'hadoop', 'fs', '-put', '-', inputArgs.targetDir + '/' + foo],
                      stdin=open(foo, 'r'))
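For context, the original check_call failed because no shell is involved: '|' is not treated as a pipe, so it and everything after it are passed to cat as plain arguments, and cat tries to parse -put as options. Below is a minimal sketch of the same stdin approach, wrapped so that the input file is closed once the command finishes. The helper names build_put_command and run_with_stdin are made up for illustration and are not part of the subprocess module.

```python
import subprocess


def build_put_command(user, gateway, target_dir, name):
    """Return the argv list for: ssh user@gateway hadoop fs -put - target_dir/name"""
    return ['ssh', user + '@' + gateway,
            'hadoop', 'fs', '-put', '-', target_dir + '/' + name]


def run_with_stdin(argv, path):
    """Run argv with the file at `path` connected to its standard input."""
    # Binary mode: the child process reads raw bytes from the file descriptor.
    with open(path, 'rb') as source:
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(argv, stdin=source)
```

With the variables from the question, this would probably be called as run_with_stdin(build_put_command(os.environ['USER'], hadoopGateway, inputArgs.targetDir, foo), foo), which needs import os as well.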