RegEx to replace text between dollar signs

Developer https://www.devze.com 2023-04-10 18:08 Source: Internet

I would like to use C# .NET to replace every instance of text between dollar signs. For example:

Check out this TeX: $x\in\mathbb{Z}^+$. It's cool.

...becomes...

Check out this TeX: <img src="http://chart.googleapis.com/chart?cht=tx&chl=x\in\mathbb{Z}^%2B" alt="x\in\mathbb{Z}^+" />. It's cool.

Note that the formula needs to be URL encoded before it is passed to the Google Charts API.

Please could you show me how to do this using RegEx (or otherwise)?


Here's an example method that will work for you. Because it iterates over the result of Regex.Matches, it handles every $...$ pair in the input, not just the first:

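using System.Text.RegularExpressions;
using System.Web; // HttpUtility

public static string AddImgTags(string input)
{
   string pattern = @"\$([^\$]*)\$";

   foreach (Match match in Regex.Matches(input, pattern))
   {
      // Groups[1] is the text between the dollar signs; match.Value would include them.
      string tex = match.Groups[1].Value;
      input = input.Replace(match.Value,
         string.Format("<img src=\"http://chart.googleapis.com/chart?cht=tx&chl={0}\" alt=\"{1}\" />",
         HttpUtility.UrlEncode(tex), tex));
   }

   return input;
}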

An explanation of the pattern (@"\$([^\$]*)\$") is as follows:

  • \$ - Matches the beginning $
  • ([^\$]*) - Matches any character except for $, repeated 0 or more times. Also groups the matching characters so that they can be referenced later.
  • \$ - Matches the ending $
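
To make the role of the capture group concrete, here is a minimal sketch. The class name CaptureGroupDemo and the helper FirstFormula are made up for illustration and are not part of the answer. It shows that match.Value is the whole match, dollar signs included, while match.Groups[1].Value is only the text in between:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureGroupDemo
{
    // Hypothetical helper: returns the text between the first pair of
    // dollar signs, or null if the input contains no such pair.
    public static string FirstFormula(string input)
    {
        Match match = Regex.Match(input, @"\$([^\$]*)\$");
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        Match match = Regex.Match(@"TeX: $x\in\mathbb{Z}^+$.", @"\$([^\$]*)\$");
        Console.WriteLine(match.Value);           // $x\in\mathbb{Z}^+$
        Console.WriteLine(match.Groups[1].Value); // x\in\mathbb{Z}^+
    }
}
```

This is why the formula passed to the URL encoder should come from Groups[1] rather than from match.Value.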


You probably want to use the overload of Regex.Replace that accepts a delegate (a MatchEvaluator) to compute the replacement:

private string GetCodeForTex(Match match)
{
    string tex = match.Groups[1].Value;
    return string.Format(
        "<img src=\"{0}\" alt=\"{1}\" />", GetEscapedUrlForTex(tex), tex);
}

…

Regex.Replace(textWithDollars, @"\$([^\$]*)\$", GetCodeForTex);

Your code in GetCodeForTex might be different (and you might think of a better name for it), but I'm sure you get the idea.
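
For completeness, here is one way the pieces above might fit together as a runnable sketch. The body of GetEscapedUrlForTex is my assumption (the answer leaves it undefined), as are the class name TexImages and the choice of Uri.EscapeDataString for URL encoding and WebUtility.HtmlEncode for the alt text:

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class TexImages
{
    // The Google Charts endpoint quoted in the question.
    private const string ChartUrl = "http://chart.googleapis.com/chart?cht=tx&chl=";

    // Assumed implementation of the helper the answer leaves undefined.
    private static string GetEscapedUrlForTex(string tex)
    {
        return ChartUrl + Uri.EscapeDataString(tex);
    }

    private static string GetCodeForTex(Match match)
    {
        string tex = match.Groups[1].Value;
        // HTML-encode the alt text so characters such as < or " cannot break the tag.
        return string.Format("<img src=\"{0}\" alt=\"{1}\" />",
            GetEscapedUrlForTex(tex), WebUtility.HtmlEncode(tex));
    }

    public static string AddImgTags(string input)
    {
        // The delegate is called once per match, so every $...$ pair is replaced.
        return Regex.Replace(input, @"\$([^\$]*)\$", GetCodeForTex);
    }

    public static void Main()
    {
        Console.WriteLine(AddImgTags(@"Check out this TeX: $x\in\mathbb{Z}^+$. It's cool."));
    }
}
```

Note that Uri.EscapeDataString percent-encodes backslashes, braces and the caret as well as the plus sign, so the URL will look more heavily encoded than the sample in the question.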

Also, be careful with simple regex-based parsing like this. It means you can never use $ for anything other than enclosing TeX, and the result will be a mess if there is an unclosed $ somewhere in the input text.
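
A small sketch of that failure mode, using a made-up input in which a price precedes a formula (the class name UnclosedDollarDemo and the helper Formulas are illustrative only):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnclosedDollarDemo
{
    // Collects everything the pattern treats as "TeX between dollar signs".
    public static List<string> Formulas(string input)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(input, @"\$([^\$]*)\$"))
        {
            found.Add(m.Groups[1].Value);
        }
        return found;
    }

    public static void Main()
    {
        // The stray $ in "$5" pairs up with the opening $ of the real formula.
        foreach (string f in Formulas("Price: $5. Formula: $x^2$ done."))
        {
            Console.WriteLine("[" + f + "]"); // prints [5. Formula: ]
        }
    }
}
```

The real formula x^2 is never matched, because its opening $ was consumed as the closing delimiter of the bogus match.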


The general regex would be

 var s = Regex.Replace("test $blabla$! It worked", @"\$.*?\$", "123");

s will become "test 123! It worked"
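
The ? after .* matters once the input contains more than one formula. A minimal sketch of the difference, with illustrative names (LazyVsGreedyDemo, ReplaceLazy, ReplaceGreedy) that are not from the answer:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyVsGreedyDemo
{
    // Lazy quantifier: stops at the nearest closing $.
    public static string ReplaceLazy(string input)
    {
        return Regex.Replace(input, @"\$.*?\$", "123");
    }

    // Greedy quantifier: runs to the last $ on the line.
    public static string ReplaceGreedy(string input)
    {
        return Regex.Replace(input, @"\$.*\$", "123");
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceLazy("a $x$ and $y$ b"));   // a 123 and 123 b
        Console.WriteLine(ReplaceGreedy("a $x$ and $y$ b")); // a 123 b
    }
}
```

Keep in mind that . does not match a newline by default, whereas [^\$] does, so the two patterns differ for formulas that span several lines.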


The other answers would do a simple replace, but they don't capture the group and place it in your replacement.

So, start with @Donut's regex, with a tiny change to add a capture group:

\$([^\$]*)\$

The other change is to call Regex.Match, so you can grab the text inside the $ signs using match.Groups[1]. (match.Captures[0] holds the whole match, dollar signs included.) Then you can run the URL encoding on it to build the replacement text.

Something like:

var urlTemplate = "<img src=\"http://chart.googleapis.com/chart?cht=tx&chl={0}\" alt=\"{1}\" />";
var matchText = match.Groups[1].Value; // text between the dollar signs

var url = string.Format(urlTemplate, HttpUtility.UrlEncode(matchText), matchText);

Since you now know exactly what the matched text is, you can do a normal string replace for this instance, and then loop to find the rest of the matches.
