Removing everything between a tag (including the tag itself) using Regex / Eclipse

开发者 https://www.devze.com 2022-12-24 08:22 Source: web
I'm fairly new to figuring out how Regex works, but this one is just frustrating.

I have a massive XML document with a lot of <description>blahblahblah</description> tags. I want to basically remove any and all instances of <description></description>.

I'm using Eclipse and have tried a few examples of Regex I've found online, but nothing works.

<description>(.*?)</description>

Shouldn't that work?

EDIT:

Here is the actual code.

<description><![CDATA[<center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"><th>ID</th><td>308</td></tr></table></center>]]></description>


I'm not familiar with Eclipse, but I would expect its regex search facility to use Java's built-in regex flavor. You probably just need to check a box labeled "DOTALL" or "single-line" or something similar, or you can add the corresponding inline modifier to the regex:

(?s)<description>(.*?)</description>

That will allow the . to match newlines, which it doesn't by default.
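To see the difference outside Eclipse, here is a minimal Java sketch, assuming (as above) that the search uses the java.util.regex flavor. The class name DescriptionStripper, its two helper methods, and the sample XML string are made up for illustration; they are not from the question.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class DescriptionStripper {
    // (?s) turns on DOTALL, so '.' also matches line terminators.
    private static final Pattern WITH_DOTALL =
            Pattern.compile("(?s)<description>(.*?)</description>");

    // The original pattern from the question, without the modifier.
    private static final Pattern WITHOUT_DOTALL =
            Pattern.compile("<description>(.*?)</description>");

    // Removes every <description>...</description> element, tags included.
    static String strip(String xml) {
        return WITH_DOTALL.matcher(xml).replaceAll("");
    }

    // Same replacement, but '.' stops at newlines.
    static String stripWithoutDotall(String xml) {
        return WITHOUT_DOTALL.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up sample with a newline inside the element.
        String xml = "<item><name>A</name><description>line one\nline two</description></item>";

        System.out.println(strip(xml));
        // prints <item><name>A</name></item>

        System.out.println(stripWithoutDotall(xml).equals(xml));
        // prints true: without DOTALL nothing matched, so nothing was removed
    }
}
```

Passing the flag explicitly, as in Pattern.compile("<description>(.*?)</description>", Pattern.DOTALL), is equivalent to the inline (?s) modifier. In an editor's find/replace dialog, the same effect comes from leaving the replacement field empty.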

EDIT: This is assuming there are newlines within the <description> element, which is the only reason I can think of why your regex wouldn't work. I'm also assuming you really are doing a regex search; is that automatic in Eclipse, or do you have to choose between regex and literal searching?
