Regex to fix GetSafeHtmlFragment x_ prefix

Developer https://www.devze.com 2023-03-21 07:52 Source: web
When using Sanitizer.GetSafeHtmlFragment from Microsoft's AntiXSSLibrary 4.0, I noticed it changes my HTML fragment from:

<pre class="brush: csharp">
</pre>

to:

<pre class="x_brush: x_csharp">
</pre>

Sadly, their API doesn't allow us to disable this behavior. Therefore, I'd like to use a regular expression (C#) to replace strings like "x_anything" with "anything" when they occur inside a class="" attribute.

Can anyone help me with the RegEx to do this?

Thanks

UPDATE - this worked for me:

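private string FixGetSafeHtmlFragment(string html)
{
    // Look for a class attribute whose value starts with the "x_" prefix.
    Match match = Regex.Match(html, "class=\"(x_).+\"", RegexOptions.IgnoreCase);

    if (match.Success)
    {
        // Group 1 is the literal prefix "x_". Note that string.Replace removes
        // every "x_" in the whole string, not only those inside class attributes.
        string key = match.Groups[1].Value;
        return html.Replace(key, "");
    }
    return html;
}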


I'm not 100% sure about the C# @ (verbatim string) syntax, but I think this should match x_ at the start of any class="" value and replace it with an empty string:

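using System.Text.RegularExpressions;

// C# string literals use double quotes; inside a verbatim (@) string,
// a double quote is written as "".
string input = "class=\"x_something\"";
Match match = Regex.Match(input, @"class=""(x_).+""",
    RegexOptions.IgnoreCase);

if (match.Success)
{
    string key = match.Groups[1].Value;  // the captured prefix "x_"
    string v = input.Replace(key, "");   // class="something"
}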
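
Since the answer above is unsure about the verbatim syntax, here is a small sketch of the two ways to write the same pattern in C#. The class name VerbatimDemo and its members are made up for illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class VerbatimDemo
{
    // Regular literal: a double quote is escaped with a backslash.
    public const string Regular = "class=\"(x_).+\"";

    // Verbatim literal: a double quote is written twice,
    // and backslashes are taken literally.
    public const string Verbatim = @"class=""(x_).+""";

    // Both literals produce the same pattern text.
    public static bool SamePattern() => Regular == Verbatim;
}
```

The verbatim form mostly pays off once the pattern contains backslashes (for example \w or \b), because they no longer need to be doubled.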


It's been over a year since this was posted, but here's a regex you can use that will remove the prefix from up to three class names. I'm sure there's a cleaner way, but it gets the job done.

VB.Net Code:

Regex.Replace(myHtml, "(<\w+\b[^>]*?\b)(class="")x[_]([a-zA-Z]*)( )?(?:x[_])?([a-zA-Z]*)?( )?(?:x[_])?([^""]*"")", "$1$2$3$4$5$6$7")
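
One possible way around the three-name limit, sketched in C# to match the question, is to match the whole class attribute first and then strip the prefix only inside its value. This is a rough sketch, not a tested drop-in fix: the names ClassPrefixFixer and StripPrefix are invented here, and it assumes the sanitizer emits double-quoted class attributes as in the sample output above. It would also strip a class name that legitimately starts with x_.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class ClassPrefixFixer
{
    // Group 1: class=" (with optional whitespace), group 2: the attribute
    // value, group 3: the closing quote. Assumes double-quoted attributes.
    private static readonly Regex ClassAttribute = new Regex(
        @"(\bclass\s*=\s*"")([^""]*)("")",
        RegexOptions.IgnoreCase);

    // "x_" only at the start of a token, i.e. not preceded by a word
    // character or a hyphen, so names like "box_shadow" are left alone.
    private static readonly Regex Prefix = new Regex(@"(?<![\w-])x_");

    public static string StripPrefix(string html)
    {
        // The evaluator rebuilds each class attribute with the prefix
        // removed from its value; text outside the attribute is untouched.
        return ClassAttribute.Replace(html, m =>
            m.Groups[1].Value
            + Prefix.Replace(m.Groups[2].Value, "")
            + m.Groups[3].Value);
    }
}
```

Called as ClassPrefixFixer.StripPrefix(html), this handles any number of class names in one pass and leaves an "x_" that appears in element text alone. Because it is still regex-based, it will also rewrite text that merely looks like a class attribute outside a tag, so an HTML parser may be a safer choice for untrusted input.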