开发者

c# regex matches example

开发者 https://www.devze.com 2023-02-05 20:26 出处:网络
I am trying to get开发者_如何学C values from the following text. How can this be done with Regex?

I am trying to get开发者_如何学C values from the following text. How can this be done with Regex?

Input

Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.

Output

456  
3434  
298   
893434 


So you're trying to grab numeric values that are preceded by the token "%download%#"?

Try this pattern:

(?<=%download%#)\d+

That should work. I don't think # or % are special characters in .NET Regex, but you'll have to either escape the backslash like \\ or use a verbatim string for the whole pattern:

var regex = new Regex(@"(?<=%download%#)\d+");
return regex.Matches(strInput);

Tested here: http://rextester.com/BLYCC16700

NOTE: The lookbehind assertion (?<=...) is important because you don't want to include %download%# in your results, only the numbers after it. However, your example appears to require it before each string you want to capture. The lookbehind group will make sure it's there in the input string, but won't include it in the returned results. More on lookaround assertions here.


All the other responses I see are fine, but C# has support for named groups!

I'd use the following code:

const string input = "Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.";

static void Main(string[] args)
{
    Regex expression = new Regex(@"%download%#(?<Identifier>[0-9]*)");
    var results = expression.Matches(input);
    foreach (Match match in results)
    {
        Console.WriteLine(match.Groups["Identifier"].Value);
    }
}

The code that reads: (?<Identifier>[0-9]*) specifies that [0-9]*'s results will be part of a named group that we index as above: match.Groups["Identifier"].Value


public void match2()
{
    string input = "%download%#893434";
    Regex word = new Regex(@"\d+");
    Match m = word.Match(input);
    Console.WriteLine(m.Value);
}


It looks like most of post here described what you need here. However - something you might need more complex behavior - depending on what you're parsing. In your case it might be so that you won't need more complex parsing - but it depends what information you're extracting.

You can use regex groups as field name in class, after which could be written for example like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text.RegularExpressions;

public class Info
{
    public String Identifier;
    public char nextChar;
};

class testRegex {

    const string input = "Lorem ipsum dolor sit %download%#456 amet, consectetur adipiscing %download%#3434 elit. " +
    "Duis non nunc nec mauris feugiat porttitor. Sed tincidunt blandit dui a viverra%download%#298. Aenean dapibus nisl %download%#893434 id nibh auctor vel tempor velit blandit.";

    static void Main(string[] args)
    {
        Regex regex = new Regex(@"%download%#(?<Identifier>[0-9]*)(?<nextChar>.)(?<thisCharIsNotNeeded>.)");
        List<Info> infos = new List<Info>();

        foreach (Match match in regex.Matches(input))
        {
            Info info = new Info();
            for( int i = 1; i < regex.GetGroupNames().Length; i++ )
            {
                String groupName = regex.GetGroupNames()[i];

                FieldInfo fi = info.GetType().GetField(regex.GetGroupNames()[i]);

                if( fi != null ) // Field is non-public or does not exists.
                    fi.SetValue( info, Convert.ChangeType( match.Groups[groupName].Value, fi.FieldType));
            }
            infos.Add(info);
        }

        foreach ( var info in infos )
        {
            Console.WriteLine(info.Identifier + " followed by '" + info.nextChar.ToString() + "'");
        }
    }

};

This mechanism uses C# reflection to set value to class. group name is matched against field name in class instance. Please note that Convert.ChangeType won't accept any kind of garbage.

If you want to add tracking of line / column - you can add extra Regex split for lines, but in order to keep for loop intact - all match patterns must have named groups. (Otherwise column index will be calculated incorrectly)

This will results in following output:

456 followed by ' '
3434 followed by ' '
298 followed by '.'
893434 followed by ' '


Regex regex = new Regex("%download#(\\d+?)%", RegexOptions.SingleLine);
Matches m = regex.Matches(input);

I think will do the trick (not tested).


This pattern should work:

#\d

foreach(var match in System.Text.RegularExpressions.RegEx.Matches(input, "#\d"))
{
    Console.WriteLine(match.Value);
}

(I'm not in front of Visual Studio, but even if that doesn't compile as-is, it should be close enough to tweak into something that works).

0

精彩评论

暂无评论...
验证码 换一张
取 消