I'm using ASP.NET 4.0 with MVC 2. I'm receiving user content that may or may not already be HTML-encoded. I've read http://weblogs.asp.net/scottgu/archive/2010/04/06/new-lt-gt-syntax-for-html-encoding-output-in-asp-net-4-and-asp-net-mvc-2.aspx, which was interesting, but what I need is a way to ensure the content is encoded without double-encoding it. I don't have control of the input process.
E.g.
User input:
&amp; &lt; < &gt; >
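Output if encoded:
&amp;amp; &amp;lt; &lt; &amp;gt; &gt;
This won't display correctly: the parts that were already encoded show up as literal &amp; &lt; &gt; text.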
Output if not encoded:
&amp; &lt; < &gt; >
This won't validate correctly, because the raw < and > remain in the markup.
You could make a first pass that decodes the user input, and then re-encode the result. This way, any parts of the input that are already encoded get decoded first, and you can then encode everything exactly once.
&amp; &lt; < &gt; >
-> decode the input and you get:
& < < > >
-> re-encode everything and you get:
&amp; &lt; &lt; &gt; &gt;
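The two passes can be sketched as follows. This is only a minimal, language-neutral illustration in Python using the standard `html` module; `encode_once` is a hypothetical helper name, not something from the question. In ASP.NET the corresponding calls would presumably be `HttpUtility.HtmlDecode` followed by `HttpUtility.HtmlEncode`.

```python
import html

def encode_once(text):
    """Decode any entities already present, then encode the whole string once."""
    decoded = html.unescape(text)            # "&lt;" -> "<", "&amp;" -> "&", ...
    return html.escape(decoded, quote=True)  # encode &, <, >, " and '

# Mixed input: some characters already encoded, some raw.
mixed = "&amp; &lt; < &gt; >"
print(encode_once(mixed))  # &amp; &lt; &lt; &gt; &gt;
```

Running the helper on its own output returns the same string, so for this input it is safe to apply more than once.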
If it were me, I'd replace only the < and > characters, leaving everything else intact.
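A minimal sketch of that idea, again in Python for illustration only (`escape_angle_brackets` is a hypothetical name; in C# two chained `String.Replace` calls would do the same job):

```python
def escape_angle_brackets(text):
    """Replace only '<' and '>'; every other character is left untouched."""
    return text.replace("<", "&lt;").replace(">", "&gt;")

print(escape_angle_brackets("&amp; &lt; < &gt; >"))  # &amp; &lt; &lt; &gt; &gt;
print(escape_angle_brackets("a & b"))                # a & b
```

Existing entities survive because `&` is never touched. The trade-off, as the second call shows, is that a raw `&` in the input also passes through unencoded.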
I don't think you will find a solution that works automatically for both encoded and unencoded content. The only way I can see to do this reliably is to specify explicitly whether the content has been encoded. Otherwise, you will run into problems in certain situations, e.g.
Some plain text mentioning &gt; being the syntax for >
And
<p>Some HTML mentioning that &amp;gt; is the syntax for &gt;</p>
You can try detecting whether encoded content or HTML content is present, but my examples above show that such detection cannot be infallible.
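To make the ambiguity concrete, here is a small Python sketch (illustration only; the function names, the `already_encoded` flag and the sample string are all hypothetical). It runs a decode-then-encode pass over plain text that mentions an entity literally, and compares it with an encoder that is told explicitly whether the content is already encoded.

```python
import html

def decode_then_encode(text):
    """Automatic approach: assume any entity in the input was meant as encoding."""
    return html.escape(html.unescape(text))

def encode(text, already_encoded):
    """Explicit approach: the caller states whether the text is already encoded."""
    return text if already_encoded else html.escape(text)

# Plain text in which the author literally wants to show the characters "&gt;".
plain = "&gt; is the syntax for >"

print(decode_then_encode(plain))             # &gt; is the syntax for &gt;
print(encode(plain, already_encoded=False))  # &amp;gt; is the syntax for &gt;
```

A browser renders the first result as "> is the syntax for >", so the literal "&gt;" the author wanted to show is lost. The second result renders as intended, but only because the caller supplied the missing piece of information.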