I have a weird problem with egrep and pipes
I am trying to filter a stream whose lines start with a topic name, such as "TICK:this is a tick message\n".
When I use egrep to filter it:
./stream_generator | egrep 'TICK' | ./topic_processor
topic_processor never seems to receive any messages.
However, when I use the Python script shown further down as the filter:
./stream_generator | python filter.py --topics TICK | ./topic_processor
everything appears to work.
I guess egrep needs some kind of 'flush' mechanism as well. Is this correct?
Can anyone here give me a clue? Thanks a million.
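import sys
from optparse import OptionParser

if __name__ == '__main__':
    parser = OptionParser()
    parser.add_option("-m", "--topics",
                      action="store", type="string", dest="topics")
    (opts, args) = parser.parse_args()
    # Topics are passed as a colon-separated list, e.g. TICK:QUOTE
    topics = opts.topics.split(':')
    while True:
        s = sys.stdin.readline()
        if not s:
            # readline() returns '' at end of input; stop instead of looping forever
            break
        for each in topics:
            # Match on the topic prefix, whatever its length
            if s.startswith(each):
                sys.stdout.write(s)
                # Flush so the next stage of the pipe sees each line immediately
                sys.stdout.flush()
                break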
Have you allowed the command ./stream_generator | egrep 'TICK' | ./topic_processor to run to completion? If the command has completed without producing output, then the problem does not lie with buffering, since upon the termination of ./stream_generator, egrep will flush any of its buffers and in turn terminate.
Now, it is true that egrep will use heavy buffering when not outputting directly to a terminal (i.e. when outputting to a pipe or file), and it may appear for a while that egrep produces no output if not enough data has accumulated in egrep's buffer to warrant a flush. This behaviour can be changed in GNU egrep by using the --line-buffered option:
./stream_generator | egrep --line-buffered 'TICK' | ./topic_processor
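To see the buffering effect in isolation, here is a minimal sketch in Python (the language of the filter script in the question). It is only an illustration, not part of either program in the pipeline: the child program in CHILD_SOURCE and the helper time_to_first_line are hypothetical names made up for this example. The child writes one line to a pipe, optionally flushes, and then sleeps for a second; the parent measures how long it takes for that line to arrive.

```python
import os
import subprocess
import sys
import time

# Hypothetical child program: write one line, flush only if asked to,
# then stay alive for a second (standing in for a long-running stream).
CHILD_SOURCE = r'''
import sys, time
sys.stdout.write("TICK:this is a tick message\n")
if sys.argv[1] == "flush":
    sys.stdout.flush()
time.sleep(1.0)
'''


def time_to_first_line(mode):
    """Run the child with stdout attached to a pipe; return (line, seconds waited)."""
    env = dict(os.environ)
    # PYTHONUNBUFFERED would disable the buffering we want to observe.
    env.pop("PYTHONUNBUFFERED", None)
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE, mode],
        stdout=subprocess.PIPE,
        env=env,
    )
    line = proc.stdout.readline()
    elapsed = time.monotonic() - start
    proc.stdout.close()
    proc.wait()
    return line.decode().strip(), elapsed


if __name__ == "__main__":
    for mode in ("flush", "noflush"):
        line, elapsed = time_to_first_line(mode)
        print("%-8s first line after %.2f s: %r" % (mode, elapsed, line))
```

Without the explicit flush, the line should only arrive once the child exits and its buffer is written out, which is the same kind of delay that egrep shows when it writes to a pipe without --line-buffered.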