Why does this sed line do this?

Why does this produce these results?

C:\>(echo a  &&  echo b) | sed "1!G;h;$p"
a
b
a
b
a

C:\>

Added-

I see now there is no question that it would give those results, but

(Added note: 2nd line = last line. I see you wrote "last line" to emphasise that it is $ matching the 2nd line as the last line, and I accept that notation. Also, "1st line", "2nd line" and "last line" refer to lines of input.)

Jonathan, you wrote

a - 1st line, pattern space

b - 2nd line, pattern space, line 1

a - 2nd line, pattern space, line 2

b - last line, $p pattern space, line 1

a - last line, $p pattern space, line 2

But wouldn't it be

Note- Dennis's comment has confirmed that the "But wouldn't it be" is correct

a - 1st line, pattern space

b - last line, $p pattern space, line 1

a - last line, $p pattern space, line 2

b - 2nd line, pattern space, line 1

a - 2nd line, pattern space, line 2

i.e. the same output, but with the descriptions of the two b/a pairs the other way around.

That assumes the $ acts while the last line is being processed, and not after it, whereas what you wrote makes it look like the $ acts afterwards.

1!G appends the hold space to the pattern space on every line except the first.
h copies the pattern space to the hold space (on every line).
$p prints the pattern space when the last line is being processed (in addition to the automatic print).
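
As a rough illustration of how these three commands combine, here is a minimal sketch. It assumes a POSIX shell with sed, and it uses a three-line input (a, b, c) that is not part of the original question. With -n added, the automatic print is suppressed and only the $p output remains, which is the input accumulated in reverse order:

```shell
# Three-line input (an assumption for illustration; the question used two lines).
# -n suppresses the automatic print, so only the $p output is shown.
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

On each line after the first, G puts the older lines after the current one and h saves the result, so the last pattern space holds every line, newest first.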

The first input line is 'a'; the G command is ignored; the line is copied to the hold space; the line is printed (because you didn't say -n).

The second input line is 'b'; the G command appends the hold space ('a') to the pattern space ('b'); the 'h' command copies the pattern space to the hold space; the pattern space is printed once because of the 'no -n'.

There is no more input, so the $p acts and prints the pattern space.

So, you get:

a - 1st line, pattern space
b - 2nd line, pattern space, line 1
a - 2nd line, pattern space, line 2
b - last line, $p pattern space, line 1
a - last line, $p pattern space, line 2

The question is asked: did I tag the pairs of 'b/a' lines backwards?

Good question: not sure...how would we find out?

Let's add another operation to the '$' set:

(echo a; echo b) | sed -e '1!G;h;$p;$s/b\na/X/'

The output is:

a
b
a
X

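which shows that the $p print occurs before the $s/// operation, which in turn occurs before the final automatic print of the pattern space. So, in the original output, the first b/a pair comes from $p and the second comes from the automatic print: I did tag the pairs backwards.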
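
A complementary check, sketched here assuming a POSIX shell and sed, is to look at each source of output on its own. The helper name input and the variable names are only for illustration. This does not show the order of the two prints by itself, only what each one contributes:

```shell
# Same two-line input as in the experiment above.
input() { printf 'a\nb\n'; }

# With -n the automatic print is suppressed, so this is only what $p prints.
explicit=$(input | sed -n '1!G;h;$p')

# Without $p (and without -n) this is only what the automatic print produces.
automatic=$(input | sed '1!G;h')

printf 'from $p:\n%s\n' "$explicit"
printf 'from the automatic print:\n%s\n' "$automatic"
```

The $p part is the two lines b, a, and the automatic part is the three lines a, b, a, which together account for the five lines of the original output.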

One side-effect of this observation, which caught me by surprise (but makes sense on reflection), is that sed knows when it is processing the last line as it is running the script on the last line. That means it does read-ahead after the newline to see whether there is more data to fetch. Dennis Williamson shows that the sed source at about line 928 contains a function test_eof() which does indeed do one character of lookahead.

(One of the good things about SO is that you learn even as you teach!)

So, somewhat to my surprise, it seems that sed knows when it has reached the last line before it processes it - so it seems to do some sort of read-ahead. Either that or I've misunderstood something really badly, or it's too late and I need to go to sleep.

The first line of input is 'a'; the second line of input is 'b'.

Sed reads the first line of input (a) into the pattern space and runs through the three commands. 1!G is skipped, because it only applies (appending the hold space to the pattern space) when the current line is not the first line. h copies the pattern space to the hold space. $p doesn't apply, because this is not the last line. After running through the commands, sed prints the pattern space, so it prints 'a'. Note that the hold space is now 'a' and will still be 'a' just after the next line is read into the pattern space. The pattern space is currently 'a' too, but it will be overwritten when the next line is read in.

Sed reads the second line of input (b) into the pattern space; the hold space is still 'a'. It then runs 1!G, appending the hold space (a) to the pattern space (b), giving a pattern space of 'b\na' and a hold space of 'a'. The next command is h, which copies the pattern space (b\na) into the hold space, giving a hold space of 'b\na', though the hold space isn't relevant from here on. The next command is $p, which applies because the second line of input (the line we are on) is also the last line of input, so it prints the pattern space and we get b\na. Then the pattern space gets printed again because, as with the first line, the pattern space is printed after all commands have run. So for the second (and last) line of input we get b\na twice: the first occurrence is from $p, and the second is from the automatic print at the end of the cycle.

a     <--- Reading the first line of input. This is the print of the pattern space that happens after the end of the commands.
b\na  <--- Reading the second line of input (which is the last line of input), hitting the $p command. That command didn't apply to the first line of input.
b\na  <--- Still on the second line of input (which is the last line of input). This is the print of the pattern space that happens after the end of the commands.
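
To watch the pattern space itself at the end of each cycle, one option is the l command, which writes the pattern space in an unambiguous form. This is only a sketch, assuming a POSIX shell and sed; it swaps $p for l and adds -n, so it is a trace of the state rather than the original command:

```shell
# -n suppresses the automatic print; l writes the pattern space
# unambiguously, showing an embedded newline as \n and the end as $.
trace=$(printf 'a\nb\n' | sed -n '1!G;h;l')
printf '%s\n' "$trace"
```

This prints a$ for the first cycle and b\na$ for the second, matching the pattern space contents described in the walkthrough.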

(Note: one might ask whether these should be a\n and b\na\n, i.e. whether there should be a newline character at the end of the pattern space, since these are lines and *nix uses \n as a line terminator rather than a line separator. Maybe, but I think the sed pattern space doesn't have a \n at the end. I read in the sed manual that when the append commands append 'text' to the pattern space or hold space, they append a \n followed by that 'text'. That suggests the pattern space doesn't end with a \n, but I'm no expert. The POSIX description of sed agrees: each input line is copied into the pattern space without its terminating newline, and a newline is added back when the pattern space is written out.)
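
The trailing-newline question can be probed directly. The following is a minimal sketch, assuming a POSIX shell and sed, with variable names chosen only for illustration. It uses the l command, which marks the end of the pattern space with $ and shows any embedded newline as \n:

```shell
# l marks the end of the pattern space with $.
plain=$(printf 'a\n' | sed -n 'l')

# G appends a newline plus the (here empty) hold space.
after_G=$(printf 'a\n' | sed -n 'G;l')

printf '%s\n%s\n' "$plain" "$after_G"
```

The first result is a$, with no \n before the end marker, and the second is a\n$, where the single \n is the one that G added in front of the empty hold space.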