I need to strip link tags from a body of text but keep the anchor text. For example:
<a href ="">AnchorText</a>
needs to become just:
AnchorText
I was considering using the following RegEx:
<(.{0}|/)(a|A).*?>
Is a RegEx the best way to go about this? If so, is the above RegEx pattern adequate? If RegEx isn't the way to go, what's a better solution? This needs to be done server side.
Your regex will do the job. You can write it a bit more simply as
</?(a|A).*?>
Here /? means zero or one /, which is equivalent to your (.{0}|/).
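As a rough illustration of how that pattern behaves, here is a minimal JavaScript sketch (for example under Node.js on the server). The names stripAnchors, loosePattern and strictPattern are made up for this example, and strictPattern is a suggested variant, not something from the question. It shows that the pattern matches any tag whose name merely starts with "a", such as <abbr>, and that adding a word boundary is one way to avoid this.

```javascript
// Pattern from the answer above, with the global flag so every match is replaced.
const loosePattern = /<\/?(a|A).*?>/g;

// Hypothetical tighter variant: \b requires the tag name to end right after "a",
// and [^>]* keeps each match inside a single tag. The i flag covers <A> as well.
const strictPattern = /<\/?a\b[^>]*>/gi;

// Remove every match of the given pattern, leaving the text between tags.
function stripAnchors(html, pattern) {
  return html.replace(pattern, "");
}

const sample = '<a href ="">AnchorText</a> and <abbr title="x">HTML</abbr>';

console.log(stripAnchors(sample, loosePattern));  // the <abbr> tags are removed too
console.log(stripAnchors(sample, strictPattern)); // only the <a> tags are removed
```

Neither pattern copes with a ">" inside an attribute value, so for arbitrary HTML a real parser is the safer choice.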
You could just use HtmlAgilityPack:
string sampleHtml = "<a href =\"\">AnchorText</a>";
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(sampleHtml);
string text = doc.DocumentNode.InnerText; //output: AnchorText
I think a regex is the best way to accomplish this, and your pattern looks like it should work.
Use jQuery replaceWith:
// Replace each link with a <span> that holds only the link's text.
$('a').replaceWith(function () {
    return $('<span/>').text($(this).text());
});
This assumes you are doing it on the client side.
I have been trying to do the same and found the following solution:
- Export the text to CSV.
- Open the file in Excel.
- Run Find and Replace with <*> as the search pattern and an empty replacement, which removes the link tags (along with any other tags) and leaves the anchor text.
- Import the result again to overwrite existing content.