What is the purpose of Regex.Escape?

Posted on https://www.devze.com, 2023-03-30 23:49. Source: the web.
I have code like the below, where 'QualifiedInstanceFilter' is the accessor for the qualified instance filter. Can anybody tell me what logic is happening in the line m_afc.QualifiedInstanceFilter = "^(" + Regex.Escape(this.Identifier) + ")$"; ? This is the full code:

    public override string Identifier
    {
        get
        {
            return string.Format("{0}{1}{2}{3}{4}",
                Owner.Class,
                IDSeparator,
                ManagedClass.Name, IDClassNameSeparator, Instance);
        }
    }

    private AlertFilter m_afc = new AlertFilter("", "", true, "", "", "");

    m_afc.QualifiedInstanceFilter = "^(" + Regex.Escape(this.Identifier) + ")$";


Regex.Escape is there to "escape" a string that may contain characters that have special meaning in a regex. A simple example:

Let's say I wanted to search a string based on user input. One would assume I could write a regex like ".*" + UserInput + ".*". The problem is: what if the user searched for "$money"? The $ has special meaning in a regex (it anchors the match to the end of the input), so the resulting pattern is .*$money.* which is incorrect: it can never match the literal text "$money".

If we passed the input through Regex.Escape first, the $ character would be escaped (as \$) and matched literally, avoiding that behavior.
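As a rough illustration of the "$money" case above, here is a minimal sketch. The class name EscapeSearchDemo, the helper BuildPattern and the sample text are made up for this example; only Regex.Escape and Regex.IsMatch come from the .NET library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSearchDemo
{
    // Hypothetical helper: wraps the user's input in ".*" on both sides,
    // optionally escaping it first so it is matched literally.
    public static string BuildPattern(string userInput, bool escape)
    {
        string middle = escape ? Regex.Escape(userInput) : userInput;
        return ".*" + middle + ".*";
    }

    public static void Main()
    {
        string text = "I need $money now";

        string naive = BuildPattern("$money", escape: false); // .*$money.*
        string safe = BuildPattern("$money", escape: true);   // .*\$money.*

        // The unescaped $ is an end-of-input anchor, so nothing can follow it.
        Console.WriteLine(naive + " -> " + Regex.IsMatch(text, naive)); // False
        // The escaped \$ is an ordinary dollar sign.
        Console.WriteLine(safe + " -> " + Regex.IsMatch(text, safe));   // True
    }
}
```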

You can learn more about it from the documentation.


From MSDN,

Escapes a minimal set of characters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) by replacing them with their escape codes. This instructs the regular expression engine to interpret these characters literally rather than as metacharacters.

In your code, the concatenated string is assigned to m_afc.QualifiedInstanceFilter, with whatever is in this.Identifier "escaped" first. Any special characters in it are prefixed with a \ so that they are treated as literals rather than metacharacters.
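To see the "minimal set" in practice, the sketch below escapes a made-up sample string and then checks that the escaped pattern matches the original text exactly. The class name MinimalSetDemo, the helper MatchesItself and the sample string are assumptions for illustration. Note that, per the list quoted above, the closing ] and } are not in the escaped set; only their opening counterparts are.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MinimalSetDemo
{
    // Hypothetical helper: true when the escaped form of 'literal',
    // anchored at both ends, matches 'literal' itself.
    public static bool MatchesItself(string literal)
    {
        string pattern = "^" + Regex.Escape(literal) + "$";
        return Regex.IsMatch(literal, pattern);
    }

    public static void Main()
    {
        string sample = "a.b[c]{d}"; // made-up input with several metacharacters

        // '.', '[' and '{' gain a backslash; ']' and '}' are left alone.
        Console.WriteLine(Regex.Escape(sample));

        // The escaped pattern still matches the original text literally.
        Console.WriteLine(MatchesItself(sample));

        // Regex.Unescape reverses the transformation.
        Console.WriteLine(Regex.Unescape(Regex.Escape(sample)) == sample);
    }
}
```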


The strings that together make up an Identifier are arbitrary strings - they could conceivably include such characters as [ ] * \ and all the other characters that have special meanings within regular expressions. However, the desired effect of QualifiedInstanceFilter is to literally match the Identifier, so if we just said

m_afc.QualifiedInstanceFilter = "^(" + this.Identifier + ")$";

we might end up with a value such as ^(()$(${P${}$*${$}{$)$, which would confuse the regex engine immensely. So we use Regex.Escape to say: "I want to use this string in a regular expression, but I want characters that are normally special to regex to NOT have their special meaning". Regex.Escape then escapes the special characters (by prepending a \), so that when we create a regular expression by concatenation, the only regex-special characters in it are the ones we put there: the initial ^( and the final )$.
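The difference can be sketched with a stand-in identifier. The class name FilterDemo, the helper BuildFilter and the identifier value "Owner(1)|Disk.C$" are hypothetical; the real Identifier format depends on separators that are not shown in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FilterDemo
{
    // Hypothetical helper mirroring the question's concatenation,
    // with escaping switchable so the two variants can be compared.
    public static string BuildFilter(string identifier, bool escape)
    {
        string body = escape ? Regex.Escape(identifier) : identifier;
        return "^(" + body + ")$";
    }

    public static void Main()
    {
        string id = "Owner(1)|Disk.C$"; // made-up identifier full of metacharacters

        string unescaped = BuildFilter(id, escape: false);
        string escaped = BuildFilter(id, escape: true);

        // Unescaped: (1) is a group and | is alternation, so the filter
        // accepts "Owner1" but rejects the identifier itself.
        Console.WriteLine(Regex.IsMatch("Owner1", unescaped)); // True
        Console.WriteLine(Regex.IsMatch(id, unescaped));       // False

        // Escaped: only the exact identifier matches.
        Console.WriteLine(Regex.IsMatch(id, escaped));         // True
        Console.WriteLine(Regex.IsMatch("Owner1", escaped));   // False
    }
}
```

The ^ and $ added around the escaped text are what restrict the match to the whole identifier rather than any string that merely contains it.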


To put it a bit more simply than the other answers:

What do you do if you want to match on a character which is a special character in Regex, e.g. a period "."? You put a backslash in front of it, i.e. you escape it.

Regex.Escape does this for any string you pass it, so it can escape stuff you don't know at compile time. One example would be including user-specified strings in the regex, which may contain special characters.
