As per my understanding of regular expressions:
* matches 0 or more occurrences of the preceding expression
+ matches 1 or more occurrences of the preceding expression
Now let's take a look at the following examples:
FIRST:-
% regexp {:+} "DHCP:Enabled" first
1
% puts $first
:       --> ":" is stored in variable first
%
SECOND:-
% regexp {:*} "DHCP:Enabled" sec
1
% puts $sec
        --> Nothing is stored in variable sec
%
Why is ":" stored for the FIRST one and not the SECOND?
The second regexp, {:*}, matches the empty string because the empty string is 0 occurrences of ":". If you use the -indices option for regexp, you'll see that it matches at position 0.
% regexp -indices :* "DHCP:Enabled" indices
1
% puts $indices
0 -1
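In other words, the regexp matches an empty string at the very first position and returns. (An end index one less than the start index, as in "0 -1", denotes an empty match.)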
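For comparison, here is a small sketch that runs both patterns from the question against the same string with -indices. The variable names text, star and plus are only illustrative.

```tcl
# Compare where {:*} and {:+} first match in the same string.
set text "DHCP:Enabled"

regexp -indices {:*} $text star
regexp -indices {:+} $text plus

puts "star: $star"   ;# 0 -1  (empty match at position 0)
puts "plus: $plus"   ;# 4 4   (the colon at position 4)
```

Because {:+} needs at least one colon, it cannot succeed at position 0 and the engine keeps scanning until it reaches index 4.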
It matches the empty string because it can match that empty string at the start of "DHCP:Enabled". The regular expression engine likes to match as early as possible. To show this, here's an interactive session:
% regexp -inline {:*} "DHCP:Enabled"
{}
% regexp -inline -all {:*} "DHCP:Enabled"
{} {} {} {} : {} {} {} {} {} {} {}
% regexp -inline -indices -all {:*} "DHCP:Enabled"
{0 -1} {1 0} {2 1} {3 2} {4 4} {5 4} {6 5} {7 6} {8 7} {9 8} {10 9} {11 10}
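The -inline option is useful for simple testing (it returns the match data as the command's result instead of storing it in variables), the -all option matches in every matchable location instead of just the first, and the -indices option returns locations rather than the matched string.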
Note that only once (4 4) is the end at least at the same index as the start; in all other cases, an empty string matches (and it's legal; you said that matching nothing was OK).
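One way to see this is to filter the -all -indices result and keep only the matches whose end is not before their start. This is a minimal sketch, assuming Tcl 8.5 or later for lassign; the names text and nonEmpty are made up for illustration.

```tcl
set text "DHCP:Enabled"
set nonEmpty {}

foreach pair [regexp -inline -indices -all {:*} $text] {
    lassign $pair start end
    # An empty match is reported with end = start - 1, so skip those.
    if {$end >= $start} {
        lappend nonEmpty [string range $text $start $end]
    }
}

puts $nonEmpty   ;# :
```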
In general, it's a really good idea to make sure that your overall RE cannot match the empty string or you'll be surprised by the results.
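A rough way to check for this is to try the RE against an empty input. The helper name matchesEmpty below is hypothetical, and the check is only approximate: an RE that uses lookahead constraints might still produce an empty match inside a longer string even though it fails here.

```tcl
# Hypothetical helper: returns 1 if the RE matches an empty input string.
proc matchesEmpty {re} {
    # "--" stops option parsing, so patterns starting with "-" are safe.
    return [regexp -- $re ""]
}

puts [matchesEmpty {:*}]   ;# 1, so {:*} may "succeed" without consuming anything
puts [matchesEmpty {:+}]   ;# 0, so {:+} needs at least one colon
```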