Regex issue with reserved characters in C#

Developer https://www.devze.com 2022-12-13 16:02 Source: Internet
I've got a working regex that scans a chunk of text for a list of keywords defined in a db. I dynamically create my regex from the db to get this:

\b(?:keywords|from|database|with|esc\@ped|characters|\@ss|gr\@ss)\b

Notice that special characters are escaped. This works for the vast majority of cases, EXCEPT where the first character of the keyword is a regex special character like @ or $. So in the above example, @ss will not be matched, but gr@ss and esc@ped will.
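
A minimal sketch of why this happens, assuming C# 9 or later so the snippet can run as top-level statements. The sample strings are made up for illustration. \b only matches between a word character and a non-word character, and @ is a non-word character, so a keyword that starts with @ has no boundary in front of it unless a letter or digit comes right before it:

```csharp
using System;
using System.Text.RegularExpressions;

// \b matches only between a word character (\w) and a non-word character.
// '@' is a non-word character, so no boundary exists between whitespace
// (or the start of the string) and a leading '@'.
string pattern = @"\b(?:@ss|gr@ss)\b";

bool plainKeyword = Regex.IsMatch("cut the gr@ss", pattern);  // 'g' after a space: boundary
bool leadingAt = Regex.IsMatch("the @ss keyword", pattern);   // '@' after a space: no boundary
bool gluedAt = Regex.IsMatch("cl@ss", pattern);               // '@' after 'l': boundary

Console.WriteLine(plainKeyword); // True
Console.WriteLine(leadingAt);    // False
Console.WriteLine(gluedAt);      // True
```

The last case shows the flip side: \b lets @ss match inside cl@ss, which is usually not wanted either.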

Any ideas how to get this regex to work for these special cases? I've tried both with and without escaping the special characters in the regex string, but to no avail.

Thanks in advance,

David


new Regex(@"(?<=^|\W)(?:keywords|from|database|with|esc@ped|characters|@ss|gr@ss)(?=\W|$)")

will match. It checks whether there is a non-word character (or beginning/end of string) before/after the keyword to be matched. I chose \W over \s because of punctuation and other non-word characters that might constitute a word boundary.

Edit: Even better (thanks to Alan Moore! - both versions will produce the same results):

new Regex(@"(?<!\w)(?:keywords|from|database|with|esc@ped|characters|@ss|gr@ss)(?!\w)")

Both will fail to match @ss in l@ss, which is probably what you want.
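
As a rough check of the lookaround version, here is a small sketch that runs it over an illustrative sentence. It assumes C# 9 or later for top-level statements; the sample text is not from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var regex = new Regex(@"(?<!\w)(?:keywords|from|database|with|esc@ped|characters|@ss|gr@ss)(?!\w)");

// Illustrative input: a keyword at the start, one followed by punctuation,
// one glued to other letters, and one embedded in a longer token.
string text = "@ss is gr@ss, esc@ped keywordsnospace keywords l@ss";

// Cast<Match>() keeps this compatible with older frameworks where
// MatchCollection is only a non-generic collection.
string[] found = regex.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();

foreach (string keyword in found)
{
    Console.WriteLine(keyword);
}
// Prints @ss, gr@ss, esc@ped and keywords, one per line.
```

Because the lookarounds are zero-width, the characters around each keyword are not consumed, so keywords that sit next to each other are all found.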


When you get the keywords from the database, escape them with Regex.Escape before creating the Regex string.
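
A possible way to combine this with the lookaround boundaries suggested above, sketched under a few assumptions: the keyword list is a hard-coded stand-in for whatever the database returns, "$cash" is a hypothetical keyword added to show a real metacharacter, and the snippet uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the keyword list loaded from the database.
var keywords = new List<string> { "keywords", "esc@ped", "@ss", "gr@ss", "$cash" };

// Regex.Escape makes metacharacters such as '$' literal. '@' is not a
// metacharacter, so it is left unchanged.
string alternation = string.Join("|", keywords.Select(k => Regex.Escape(k)));

// Lookarounds instead of \b, so keywords may start or end with a non-word character.
string pattern = @"(?<!\w)(?:" + alternation + @")(?!\w)";
var regex = new Regex(pattern);

string text = "pay $cash for the @ss, not the cl@ss";
string[] found = regex.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();

Console.WriteLine(pattern);
Console.WriteLine(string.Join(", ", found)); // $cash, @ss
```

Without the escaping step, the $ in "$cash" would be read as an end-of-string anchor and that keyword could never match.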


The @ is not a word character, so \b does not match between a space (or the start of the string) and a leading @.

Use: (\s|^)(?:keywords|from|database|with|esc@ped|characters|@ss|gr@ss)(\s|$)

Tested with the following program:

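    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            string pattern = "(\\s|^)(?:keywords|from|database|with|esc@ped|characters|@ss|gr@ss)(\\s|$)";
            var matches = Regex.Matches("@ss is gr@ss is esc@ped keywordsnospace keywords", pattern);
            foreach (Match match in matches)
            {
                // The keyword group is non-capturing, so print the whole match
                // with the surrounding whitespace trimmed off.
                Console.WriteLine(match.Value.Trim());
            }
        }
    }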

Giving the result:

    @ss
    gr@ss
    esc@ped
    keywords
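
One caveat worth checking with this approach: (\s|^) and (\s|$) consume the whitespace they match, so two keywords separated by a single space may not both be found. The sketch below compares it with a zero-width variant; the (?<!\S) and (?!\S) lookarounds are a substitution made here for illustration, not part of the answer above, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "@ss gr@ss";

// Whitespace is part of the match, so the space after "@ss" is used up
// and is no longer available as the leading separator for "gr@ss".
string consuming = @"(\s|^)(?:@ss|gr@ss)(\s|$)";

// Zero-width alternative: "not preceded/followed by a non-whitespace character".
string zeroWidth = @"(?<!\S)(?:@ss|gr@ss)(?!\S)";

int consumingCount = Regex.Matches(text, consuming).Count;
int zeroWidthCount = Regex.Matches(text, zeroWidth).Count;

Console.WriteLine(consumingCount); // 1
Console.WriteLine(zeroWidthCount); // 2
```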
