
Captures count is always zero

Developer https://www.devze.com 2023-03-08 20:11 Source: web

I've got a problem. I use the following regular expression:


Pattern =
  (?'name'\w+(?:\w|\s)*), \s*
  (?'category'\w+(?:\w|\s)*), \s*
  (?:
      \{ \s*
          [yY]: (?'year'\d+), \s*
          [vV]: (?'volume'(?:([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))+), \s*
      \} \s*
      ,? \s*
  )*

with the IgnorePatternWhitespace option. Everything seemed fine in my application until I debugged it and encountered a problem.


var Year = default(UInt32);
// ...
if((Match = Regex.Match(Line, Pattern, Options)).Success)
{
    // Getting Product header information
    Name = Match.Groups["name"].Value;

    // Gathering Product statistics
    for(var ix = default(Int32); ix < Match.Groups["year"].Captures.Count; ix++)
    {
       // never get here
       Year = UInt32.Parse(Match.Groups["year"].Captures[ix].Value, NumberType, Culture);
    }
}

In the code above, Match is always successful and I get the proper value for Name, but when it comes to the for loop, program flow just skips it. Debugging shows there are no Captures in Match.Groups["year"], so skipping the loop is the logical behavior. It's not obvious to me where I went wrong, though. Help!!

This is related to my previous post, "Extract number values enclosed inside curly brackets".

Thanks!

EDIT. Input Samples

Sherwood, reciever, {y:2008,V:5528.35}, {y:2009,V:8653.89}, {y:2010, V:4290.51}
  • I need to capture the values 2008, 5528.35, 2009, 8653.89, 2010, 4290.51 and work with them as named groups.

2nd EDIT

I tried using the ExplicitCapture option and the following expression:

(?<name>\w+(\w| )*), (?<category>\w+(\w| )*), (\{[yY]:(?<year>\d+), *[vV]:(?<volume>(([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))+)\}(, )?)+

But that didn't help.


Edit: You could simplify by matching everything until the next comma: [^,]*. Here's a full code snippet to match your source data:

var testRegex = new Regex(@"
    (?'name'[^,]*),\s*
    (?'category'[^,]*),\s*
    (\{y:(?'year'[^,]*),\s*
    V:(?'volume'[^,}]*)\},?\s*)*",
    RegexOptions.IgnorePatternWhitespace);
var testMatches = testRegex.Matches(
    "Sherwood, reciev, {y:2008,V:5528.35}, {y:2009,V:8653.89}, {y:2010, V:4290.51}");
foreach (Match testMatch in testMatches)
{
    Console.WriteLine("Name = {0}", testMatch.Groups["name"].Value);
    foreach (var capture in testMatch.Groups["year"].Captures)
        Console.WriteLine("    Year = {0}", capture);
}

This prints:

Name = Sherwood
    Year = 2008
    Year = 2009
    Year = 2010
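
Each repetition of the braces group adds one capture to both year and volume, so the two capture collections line up by index and can be read as pairs. The following is only a sketch of that idea, written as C# 9+ top-level statements. The names line and stats, the choice of decimal for the volume, and the slightly stricter pattern (escaped braces, volume stopping before the closing brace) are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Input sample from the question.
const string line =
    "Sherwood, reciever, {y:2008,V:5528.35}, {y:2009,V:8653.89}, {y:2010, V:4290.51}";

// Variation of the simplified pattern: braces are escaped and the volume
// stops before the closing brace, so it holds only the number.
var regex = new Regex(@"
    (?'name'[^,]*),\s*
    (?'category'[^,]*),\s*
    (?:\{\s*[yY]:(?'year'\d+),\s*
       [vV]:(?'volume'[^,}]+)\}\s*,?\s*)*",
    RegexOptions.IgnorePatternWhitespace);

var match = regex.Match(line);
var years = match.Groups["year"].Captures;
var volumes = match.Groups["volume"].Captures;

// year and volume are captured once per repetition, so index i in one
// collection corresponds to index i in the other.
var stats = new List<(uint Year, decimal Volume)>();
for (var i = 0; i < years.Count; i++)
{
    var year = uint.Parse(years[i].Value, CultureInfo.InvariantCulture);
    // Invariant culture so that '.' is always read as the decimal separator.
    var volume = decimal.Parse(volumes[i].Value, NumberStyles.Float, CultureInfo.InvariantCulture);
    stats.Add((year, volume));
}

foreach (var (year, volume) in stats)
    Console.WriteLine(FormattableString.Invariant($"{year}: {volume}"));
```

Parsing with an explicit culture matters here because the sample data uses a dot as the decimal separator, which a machine configured for a comma-decimal culture would otherwise misread.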


I think the problem is a comma:

, \s* \}

which should be optional (or omitted?):

,? \s* \}
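
This also explains why the match succeeds while the loop never runs: the whole braces group is quantified with *, so when the mandatory comma before \} cannot be found, the group simply matches zero times and the regex still matches the name and category part. A small sketch to check this against the input sample, written as C# 9+ top-level statements; the helper names BuildPattern and CountYears are hypothetical and exist only for this comparison.

```csharp
using System;
using System.Text.RegularExpressions;

// Input sample from the question.
const string line =
    "Sherwood, reciever, {y:2008,V:5528.35}, {y:2009,V:8653.89}, {y:2010, V:4290.51}";

// The question's pattern, with the token that follows the volume group
// passed in so both variants can be compared side by side.
string BuildPattern(string afterVolume) => @"
  (?'name'\w+(?:\w|\s)*), \s*
  (?'category'\w+(?:\w|\s)*), \s*
  (?:
      \{ \s*
          [yY]: (?'year'\d+), \s*
          [vV]: (?'volume'(?:([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))+)" + afterVolume + @" \s*
      \} \s*
      ,? \s*
  )*";

// Number of captures recorded for the 'year' group on the sample line.
int CountYears(string pattern) =>
    Regex.Match(line, pattern, RegexOptions.IgnorePatternWhitespace)
         .Groups["year"].Captures.Count;

var requiredCount = CountYears(BuildPattern(","));   // comma required, as in the question
var optionalCount = CountYears(BuildPattern(",?"));  // comma made optional

Console.WriteLine(requiredCount);  // prints 0
Console.WriteLine(optionalCount);  // prints 3
```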


To expound on what MRAB said:

(?'name'
    \w+
    (?:
       \w|\s
    )*
),
\s* 
(?'category'
     \w+
     (?:
         \w|\s
     )*
),
\s* 
(?:
      \{ 
          \s*
          [yY]:
          (?'year'
               \d+
          ),
          \s*
          [vV]:
          (?'volume'
               (?:
                   (     # Why do you need capturing parentheses here?
                     [1-9][0-9]*
                     \.?
                     [0-9]*
                   )
                 |
                   (
                     \.[0-9]+
                   )
               )+
          ),        # I'm guessing this comma doesn't match the input samples
          \s*
      \}
      \s*
      ,?
      \s*
)*


Sherwood, reciever, {y:2008,V:5528.35}, {y:2009,V:8653.89}, {y:2010, V:4290.51}