TCL Regular Expression Doubt

Developer https://www.devze.com 2023-01-09 10:48 Source: Internet
As I understand regular expressions:

--> * matches zero or more occurrences of the preceding expression

--> + matches one or more occurrences of the preceding expression

Now let's look at the following examples:

FIRST:-

% regexp {:+} "DHCP:Enabled" first
1
% puts $first
:                     --> ":" is stored in variable first
%

SECOND:-

% regexp {:*} "DHCP:Enabled" sec
1
% puts $sec
                     --> Nothing is stored in variable sec
%

Why is ":" stored for the FIRST one and not the SECOND?


The second regexp {:*} matches the empty string because the empty string is 0 occurrences of :. If you use the -indices option for regexp, you'll see that it matches at position 0.

 % regexp -indices :* "DHCP:Enabled" indices
 1
 % puts $indices
 0 -1

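In other words, the regexp succeeds right at index 0 with a zero-length match (an end index one less than the start index means the match is empty) and stops there, without ever reaching the colon.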
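To compare the two patterns side by side, here is a minimal sketch that runs both with -indices and prints the start, end, and length of each match. The variable names (text, star, plus) are illustrative only, and lassign assumes Tcl 8.5 or later.

```tcl
set text "DHCP:Enabled"

# {:*} may match zero colons, so it succeeds immediately at index 0.
regexp -indices {:*} $text star
# {:+} needs at least one colon, so the engine scans forward to index 4.
regexp -indices {:+} $text plus

foreach {name pair} [list star $star plus $plus] {
    lassign $pair start end
    # A match is empty when its end index is one less than its start index.
    set length [expr {$end - $start + 1}]
    puts "$name: start=$start end=$end length=$length"
}
# prints:
#   star: start=0 end=-1 length=0
#   plus: start=4 end=4 length=1
```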


It matches the empty string because an empty match is available at the very start of “DHCP:Enabled”. The regular expression engine likes to match as early as possible. To show this, here's an interactive session:

% regexp -inline {:*} "DHCP:Enabled"
{}
% regexp -inline -all {:*} "DHCP:Enabled"
{} {} {} {} : {} {} {} {} {} {} {}
% regexp -inline -indices -all {:*} "DHCP:Enabled"
{0 -1} {1 0} {2 1} {3 2} {4 4} {5 4} {6 5} {7 6} {8 7} {9 8} {10 9} {11 10}

The -inline option is useful for simple testing, -all matches at every matchable location instead of just the first, and -indices returns locations rather than the matched strings.

Note that only one pair, {4 4}, has an end index that is at least as large as its start index; in every other pair the end is one less than the start, which means an empty string matched (and it's legal; you said that matching nothing was OK).
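The index pairs can also be filtered programmatically. The following is a small sketch, not a canonical idiom, that keeps only the non-empty matches from a -all run; the variable names are made up for illustration and lassign assumes Tcl 8.5 or later.

```tcl
set text "DHCP:Enabled"
set nonEmpty {}

foreach pair [regexp -all -inline -indices {:*} $text] {
    lassign $pair start end
    # Keep a match only if it covers at least one character.
    if {$end >= $start} {
        lappend nonEmpty [string range $text $start $end]
    }
}

puts $nonEmpty
# prints:
#   :
```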

In general, it's a really good idea to make sure that your overall RE cannot match the empty string or you'll be surprised by the results.
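One way to guard against this surprise is to treat a zero-length match as a failure. The sketch below uses a hypothetical helper, matchesNonEmpty, which is not part of Tcl; it only inspects the first match and assumes Tcl 8.5 or later for lassign. In many cases simply writing the pattern with + instead of * is enough.

```tcl
# Hypothetical helper (not a Tcl built-in): returns 1 only when the
# first match of pattern in text covers at least one character.
proc matchesNonEmpty {pattern text} {
    if {![regexp -indices -- $pattern $text span]} {
        return 0
    }
    lassign $span start end
    # An empty match has an end index one less than its start index.
    return [expr {$end >= $start}]
}

puts [matchesNonEmpty {:*} "DHCP:Enabled"]   ;# 0: first match is empty
puts [matchesNonEmpty {:+} "DHCP:Enabled"]   ;# 1: matches the colon
puts [matchesNonEmpty {:+} "DHCPEnabled"]    ;# 0: no colon at all
```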
