I have an application that allows the user to create an article. The problem arises when the user pastes from something like Word, which comes loaded with a bunch of markup.
I'm using a jQuery editor called tiny_mce, which allows the markup. I HTML-encode and decode the content, obviously, but that means I carry a huge payload of markup.
Is there a way to strip (all) markup from pasted text and just keep the text?
Or is there a way that tiny_mce can show the markup as text?
It's been a while since I used tinyMCE, but when I did I used the paste plugin, which did automatic clean-up on paste, including for content pasted from Word.
Strip all HTML markup using Regex: http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx
string stripped = Regex.Replace(textBox1.Text, @"<(.|\n)*?>", string.Empty);
This regular expression can be ported to the language of your choice.
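As a rough illustration of porting the pattern, here is a minimal Python sketch. The function name strip_markup and the sample string are made up for this example, and the entity-decoding step is an extra that the C# line does not do. Note that a regex like this only removes the tags themselves: text inside style, script, or comment blocks (which Word pastes often include) is left behind, so treat it as a quick clean-up rather than a full sanitizer.

```python
import html
import re

# Same pattern as the C# line: "<", then as few characters as possible
# (newlines included), then ">".
TAG_PATTERN = re.compile(r"<(.|\n)*?>")


def strip_markup(text: str) -> str:
    """Remove anything that looks like a tag, then decode HTML entities.

    Hypothetical helper for illustration; decoding entities is an
    assumption about what "just keep the text" should mean.
    """
    without_tags = TAG_PATTERN.sub("", text)
    return html.unescape(without_tags)


if __name__ == "__main__":
    # Made-up sample resembling a fragment pasted from Word.
    pasted = (
        '<p class="MsoNormal"><span style="font-size:11pt">'
        "Hello, <b>world</b> &amp; friends</span></p>"
    )
    print(strip_markup(pasted))
```

If the stripped text is later written back into a page, it still needs to be HTML-encoded on output, since decoding entities can reintroduce characters such as < and >.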
I use a simple Windows shell add-in called Pure Text. It overloads the Windows+V key to do a plain-text paste.