For my own amusement, I've cooked up a Python script that lets me use Python for bash one-liners: supply a Python generator expression, and the script iterates over it. Here's the script:
DEFAULT_MODULES = ['os', 're', 'sys']
_g = {}
for m in DEFAULT_MODULES:
    _g[m] = __import__(m)
import sys
sys.stdout.writelines(eval(sys.argv[1], _g))
And here's how you might use it.
$ groups | python pype.py '(l.upper() for l in sys.stdin)'
DBORNSIDE
$
For the intended use, it works perfectly!
But when I don't feed it from a pipe and just invoke it directly, for instance (keys I press are shown in brackets):
$ python pype.py '("%r\n" % (l,) for l in sys.stdin)'
foo[Enter]
bar[Enter]
baz[Enter]
[Ctrl-D][Ctrl-D]'foo\n'
'bar\n'
'baz\n'
$
In order to stop accepting input and produce any output, I have to type either Enter - Ctrl D - Ctrl D or Ctrl D - Ctrl D - Ctrl D. This violates my expectations, that each line should be processed as entered, and that typing Ctrl D at any time will end the script. Where is the gap in my understanding?
EDIT: I've updated the interactive example to show that I'm not seeing the quoting wim describes in his answer, and some more examples too.
$ python pype.py '("%r\n" % (l,) for l in sys.stdin)'
foo[Ctrl-D][Ctrl-D]bar[Enter]
[Ctrl-D][Ctrl-D]'foobar\n'
$ python pype.py '("%r\n" % (l,) for l in sys.stdin)'
foo[Ctrl-V][Ctrl-D]^Dbar[Enter]
[Ctrl-D][Ctrl-D]'foo\x04bar\n'
$
Ctrl-D is recognized not necessarily as EOF, but as "terminate current read() call".
If you have an empty line (or just pressed Ctrl-D) and press Ctrl-D, your read() terminates immediately and returns 0 read bytes. And this is a sign for EOF.
If you have data in a line and press Ctrl-D, your read() terminates with whatever there has been typed, of course without a terminating newline ('\n').
So if you have input data, you press Ctrl-D twice on a non-empty line, or once on an empty one (i.e. after pressing Enter).
This all holds for the normal OS interface, accessible from Python via os.read().
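As a rough illustration of the "0 bytes means EOF" rule at the os.read() level, here is a minimal sketch. It uses a pipe in place of a terminal so that it runs unattended; closing the write end plays the role of Ctrl-D on an empty line. The helper name read_until_eof is made up for this example.

```python
import os

def read_until_eof(fd, chunk_size=1024):
    """Collect chunks from os.read() until it returns zero bytes."""
    chunks = []
    while True:
        data = os.read(fd, chunk_size)
        if not data:          # zero bytes read: the OS-level EOF signal
            break
        chunks.append(data)
    return chunks

# A pipe stands in for the terminal here (an assumption of this sketch).
read_fd, write_fd = os.pipe()
os.write(write_fd, b"foo\n")
os.write(write_fd, b"bar")    # no trailing newline, like Ctrl-D mid-line
os.close(write_fd)            # after this, a read on an empty pipe returns b""
chunks = read_until_eof(read_fd)
os.close(read_fd)
print(chunks)
```

How the bytes are split across chunks depends on timing, but the loop always ends on the first zero-byte read.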
Python file objects, and also file iterators, treat the first EOF they see as the end of the current read() call, on the assumption that nothing more is coming. The next read() call tries again and needs another Ctrl-D before it really returns 0 bytes. The reason is that a file object's read() always tries to return as many bytes as requested, and keeps calling the OS read() to fill up whenever one returns fewer than requested.
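The "fill up" behaviour described above can be mimicked with a toy model. This is only a sketch of the idea, not CPython's actual buffering code; make_raw_read and filling_read are hypothetical names, and each b"" in the event list models one Ctrl-D pressed on an empty line.

```python
def make_raw_read(events):
    """Return a fake OS-level read(); each call hands back the next event.

    b"" models a Ctrl-D pressed on an empty line (a zero-byte read).
    For simplicity the requested size n is ignored.
    """
    pending = list(events)

    def raw_read(n):
        return pending.pop(0) if pending else b""

    return raw_read

def filling_read(raw_read, wanted):
    """Keep calling raw_read until `wanted` bytes arrive or a read returns nothing."""
    buf = b""
    while len(buf) < wanted:
        data = raw_read(wanted - len(buf))
        if not data:          # a zero-byte read ends this call only
            break
        buf += data
    return buf

raw = make_raw_read([b"foo\n", b"bar\n", b"", b""])
first = filling_read(raw, 8192)    # swallows the first zero-byte read
second = filling_read(raw, 8192)   # needs the second one to report EOF
print(repr(first), repr(second))
```

In this model the first call returns both lines and uses up the first Ctrl-D, and only the second call comes back empty, which matches the double Ctrl-D seen in the question.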
In contrast to file.readline(), iter(file) uses the internal read() functions to read, and so always has this special requirement of the extra Ctrl-D.
I always use iter(file.readline, '') to read line-wise from a file.
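A minimal sketch of that idiom follows. io.StringIO stands in for sys.stdin so the example runs without a terminal, and lines_as_read is a name invented here for illustration.

```python
import io

def lines_as_read(stream):
    """Yield lines via stream.readline() until it returns the empty string."""
    return iter(stream.readline, "")

# io.StringIO replaces sys.stdin here (an assumption of this sketch).
fake_stdin = io.StringIO(u"foo\nbar\nbaz")
result = [line.upper() for line in lines_as_read(fake_stdin)]
print(result)
```

Applied to the one-liner from the question, the expression would presumably become '(l.upper() for l in iter(sys.stdin.readline, ""))', so that each line is handed over as soon as readline() returns it.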
Ctrl+D is recognized by the terminal device, which responds to it by generating an end-of-file. Perhaps this will help, from Wikipedia (emphasis mine):
In UNIX and AmigaDOS, the translation of the keystroke to EOF is performed by the terminal driver, so a program does not need to distinguish terminals from other input files. By default, the driver converts a Control-D character at the start of a line into an end-of-file indicator. To insert an actual Control-D (ASCII 04) character into the input stream, the user precedes it with a "quote" command character (usually Control-V, although on some systems you achieve this effect by typing Control-D twice).
I can't say exactly why the extra Ctrl+D is needed (the other answer does a very good job of that, though), but this version prints the input after only a single Ctrl+D. You still need to press Ctrl+D a second time to exit the script.
#!/usr/bin/python
DEFAULT_MODULES = ['os', 're', 'sys']
_g = {}
for m in DEFAULT_MODULES:
    _g[m] = __import__(m)
import sys
for x in eval(sys.argv[1], _g):
    print x,
Output:
[ root@host ~ ]$ ./test.py '(l.upper() for l in sys.stdin)'
abc
def(ENTER, CTRL+D)
ABC
DEF
qwerty(ENTER, CTRL+D)
QWERTY
[ root@host ~ ]$
Edit:
eval is returning a generator in this case, so possibly the first EOF (Ctrl+D) ends the reading of sys.stdin, and the second stops the generator that eval is producing.
Generator - A function which returns an iterator. It looks like a normal function except that it contains yield statements for producing a series a values usable in a for-loop or that can be retrieved one at a time with the next() function. Each yield temporarily suspends processing, remembering the location execution state (including local variables and pending try-statements). When the generator resumes, it picks-up where it left-off (in contrast to functions which start fresh on every invocation).
Generator Class reference (section 9.10)