Why do I have to type ctrl-d twice? [duplicate]

This question already has answers here: Why do I have to press Ctrl+D twice to close stdin? (5 answers) Closed 9 years ago.

For my own amusement, I've cooked up a Python script that lets me use Python for bash one-liners: I supply a Python generator expression, and the script iterates over it. Here's the script:

DEFAULT_MODULES = ['os', 're', 'sys']

_g = {}
for m in DEFAULT_MODULES:
    _g[m] = __import__(m)

import sys
sys.stdout.writelines(eval(sys.argv[1], _g))

And here's how you might use it.

$ groups | python pype.py '(l.upper() for l in sys.stdin)'
DBORNSIDE
$

For the intended use, it works perfectly!

But when I don't feed it from a pipe and just invoke it directly, for instance (key names such as Enter and Ctrl D mark the keys I press):

$ python pype.py '("%r\n" % (l,) for l in sys.stdin)'
foo<Enter>
bar<Enter>
baz<Enter>
<Ctrl D><Ctrl D>'foo\n'
'bar\n'
'baz\n'
$

In order to stop accepting input and produce any output, I have to type either Enter - Ctrl D - Ctrl D or Ctrl D - Ctrl D - Ctrl D. This violates my expectations, that each line should be processed as entered, and that typing Ctrl D at any time will end the script. Where is the gap in my understanding?

EDIT: I've updated the interactive example to show that I'm not seeing the quoting wim describes in his answer, and some more examples too.

$ python pype.py '("%r\n" % (l,) for l in sys.stdin)'
foo<Ctrl D><Ctrl D>bar<Enter>
<Ctrl D><Ctrl D>'foobar\n'
$ python pype.py '("%r\n" % (l,) for l in sys.stdin)'
foo<Ctrl V><Ctrl D>^Dbar<Enter>
<Ctrl D><Ctrl D>'foo\x04bar\n'
$

Ctrl-D does not necessarily mean EOF; it means "terminate the current read() call".

If you are on an empty line (or have just pressed Ctrl-D) and press Ctrl-D, your read() terminates immediately and returns 0 bytes, which is the signal for EOF.

If you have data in a line and press Ctrl-D, your read() returns whatever has been typed so far, of course without a terminating newline ('\n').

So if you have input data, you press Ctrl-D twice on a non-empty line, or once on an empty one, i.e. after pressing Enter.

This all holds for the normal OS interface, accessible from Python via os.read().
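To see this at the os.read() level without typing anything, here is a minimal sketch that "types" into a pseudo-terminal. It assumes a Unix-like system where a fresh pty starts in canonical mode with Ctrl-D (byte 0x04) as the EOF character; read_after_typing is just a name I made up for the helper.

```python
import os
import pty
import select


def read_after_typing(keys, timeout=2.0):
    """Feed `keys` to a fresh pseudo-terminal and return what one os.read() sees."""
    # Assumption: the slave side starts in canonical mode with the default
    # control characters, so byte 0x04 acts as Ctrl-D.
    master, slave = pty.openpty()
    try:
        os.write(master, keys)  # "type" the keys on the terminal
        ready, _, _ = select.select([slave], [], [], timeout)
        if not ready:
            # A real read() would still be blocked here, waiting for Enter or Ctrl-D.
            raise TimeoutError("nothing handed to read() yet")
        return os.read(slave, 1024)
    finally:
        os.close(master)
        os.close(slave)


if __name__ == "__main__":
    print(read_after_typing(b"foo\n"))    # Enter completes the line
    print(read_after_typing(b"foo\x04"))  # Ctrl-D mid-line: the partial line, no newline
    print(read_after_typing(b"\x04"))     # Ctrl-D on an empty line: 0 bytes, i.e. EOF
```

The Ctrl-D byte itself never reaches the program in either case; the driver only uses it as the cue to hand over whatever is pending.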

Python file objects, and also file iterators, treat the first EOF they see only as the end of the current read() call, assuming that nothing more is available. The next read() call tries again and needs another Ctrl-D before it really returns 0 bytes. The reason is that a file object's read() always tries to return as many bytes as requested, and it tries to fill up if an OS read() returns fewer than requested.
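The fill-up behaviour can be imitated without a terminal. This is only a sketch using Python 3's io module (the question is about Python 2 file objects, which sit on C stdio, so take it as an analogy rather than the exact code path). FakeTerminal and count_raw_reads are hypothetical names; each chunk stands for what one OS-level read() would return.

```python
import io


class FakeTerminal(io.RawIOBase):
    """Hypothetical stand-in for a terminal: one chunk per OS-level read()."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = list(chunks)
        self.raw_reads = 0  # how many times the "OS" was asked for data

    def readable(self):
        return True

    def readinto(self, buf):
        self.raw_reads += 1
        chunk = self._chunks.pop(0) if self._chunks else b""
        buf[:len(chunk)] = chunk  # chunks here are always smaller than buf
        return len(chunk)


def count_raw_reads(method_name):
    """Call reader.read(10) or reader.read1(10) and report (data, raw reads used)."""
    # b"foo" models "foo" followed by Ctrl-D; b"" models a second Ctrl-D.
    raw = FakeTerminal([b"foo", b""])
    reader = io.BufferedReader(raw)
    data = getattr(reader, method_name)(10)
    return data, raw.raw_reads


if __name__ == "__main__":
    print(count_raw_reads("read"))   # keeps asking until 10 bytes or a 0-byte read
    print(count_raw_reads("read1"))  # settles for a single raw read
```

Both calls return the same data; the difference is that read() goes back to the raw stream a second time, and on a real terminal that second request is what blocks until the next Ctrl-D.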

In contrast to file.readline(), iter(file) uses the internal read() functions and therefore always has this requirement of an extra Ctrl-D.

I always use iter(file.readline, '') to read line-wise from a file.
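For anyone who hasn't seen the two-argument form of iter(): it calls the function repeatedly and stops as soon as the result equals the sentinel, which for readline() is the empty string it returns at EOF. A small sketch, with read_lines as a made-up helper name and io.StringIO standing in for a real file:

```python
import io


def read_lines(stream):
    """Return an iterator that calls stream.readline() until it returns ''."""
    # '' is what readline() returns at EOF; a blank line comes back as '\n'.
    return iter(stream.readline, "")


if __name__ == "__main__":
    fake_stdin = io.StringIO("foo\nbar\n")
    for line in read_lines(fake_stdin):
        print(repr(line))
```

With the script from the question, the same idea would be to pass '("%r\n" % (l,) for l in iter(sys.stdin.readline, ""))' as the expression instead of iterating over sys.stdin directly.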

Ctrl+D is recognized by the terminal device, and the terminal responds to it by generating an end of file. Perhaps this will help, from Wikipedia (emphasis mine):

In UNIX and AmigaDOS, the translation of the keystroke to EOF is performed by the terminal driver, so a program does not need to distinguish terminals from other input files. By default, the driver converts a Control-D character at the start of a line into an end-of-file indicator. To insert an actual Control-D (ASCII 04) character into the input stream, the user precedes it with a "quote" command character (usually Control-V, although on some systems you achieve this effect by typing Control-D twice).
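The quoting part of that passage can be tried out on a pseudo-terminal. A rough sketch, assuming a Unix-like system whose default terminal settings use Ctrl-V (byte 0x16) as the "literal next" character and Ctrl-D (byte 0x04) as EOF; type_and_read is a hypothetical helper name.

```python
import os
import pty
import select


def type_and_read(keys, timeout=2.0):
    """Write `keys` to a new pty and return what one os.read() on it yields."""
    # Assumption: default terminal settings (canonical mode, Ctrl-V quoting enabled).
    master, slave = pty.openpty()
    try:
        os.write(master, keys)
        ready, _, _ = select.select([slave], [], [], timeout)
        if not ready:
            raise TimeoutError("no complete line was delivered")
        return os.read(slave, 1024)
    finally:
        os.close(master)
        os.close(slave)


if __name__ == "__main__":
    # foo, Ctrl-V, Ctrl-D, bar, Enter: the quoted Ctrl-D stays in the data.
    print(type_and_read(b"foo\x16\x04bar\n"))
```

This is the same 'foo\x04bar\n' that shows up in the question's second interactive example: the quoted Ctrl-D is delivered as an ordinary byte and does not end the read.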

I can't say exactly why the extra CTRL+D is needed (the other answer does a very good job of that, though), but with this change the input is printed after only a single CTRL+D. You still need to press CTRL+D a second time to exit the script:

#!/usr/bin/python
DEFAULT_MODULES = ['os', 're', 'sys']

_g = {}
for m in DEFAULT_MODULES:
    _g[m] = __import__(m)

import sys
for x in eval(sys.argv[1], _g):
    print x,

Output:

[ root@host ~ ]$ ./test.py '(l.upper() for l in sys.stdin)'
abc
def(ENTER, CTRL+D)
ABC
DEF
qwerty(ENTER, CTRL+D)
QWERTY
[ root@host ~ ]$

Edit:

eval is returning a generator in this case, so possibly the first EOF (CTRL+D) ends the reading of sys.stdin, and the second stops the generator that eval is producing.

Generator - A function which returns an iterator. It looks like a normal function except that it contains yield statements for producing a series of values usable in a for-loop or that can be retrieved one at a time with the next() function. Each yield temporarily suspends processing, remembering the location execution state (including local variables and pending try-statements). When the generator resumes, it picks-up where it left-off (in contrast to functions which start fresh on every invocation).
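A tiny illustration of that definition, using a list in place of sys.stdin (upper_lines is just an example name):

```python
def upper_lines(lines):
    """Generator function: its body runs only when a consumer asks for a value."""
    for line in lines:
        yield line.upper()  # suspend here until the next value is requested


gen = upper_lines(iter(["abc\n", "def\n"]))
first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # resumes after the yield and runs until the source is exhausted

if __name__ == "__main__":
    print(repr(first), rest)
```

Note that the generator finishes as soon as its source runs out; it doesn't need a separate end-of-input signal of its own, so any extra Ctrl-D would have to be consumed by whatever is reading sys.stdin underneath.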

Generator Class reference (section 9.10)