So let's say that I have:
<any_html_element>maybe some whitespace <br/>Some text</any_html_element>
And I want to remove the first <br/> after <any_html_element>.
How can I do that?
Start by using an HTML parser, not a regex, to identify the block of code you want to manipulate.
Once you've isolated that element, you can do a simple replace to remove the <br/>.
Here are a couple of PHP HTML parser links to investigate:
- http://simplehtmldom.sourceforge.net/
- http://www.phpbuilder.com/manual/en/function.dom-domdocument-loadhtml.php
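As a rough illustration of the parser approach, here is a minimal sketch using PHP's built-in DOMDocument. The input string and the choice of <div> as the stand-in for any_html_element are assumptions made for the example, not part of the question.

```php
<?php
// Hypothetical input; <div> stands in for any_html_element.
$html = '<div>  <br/>Some text<br/>More text</div>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Locate the element to manipulate (the first <div> in the document).
$element = $doc->getElementsByTagName('div')->item(0);

// The first <br> inside that element, in document order.
$br = $element->getElementsByTagName('br')->item(0);
if ($br !== null) {
    $br->parentNode->removeChild($br);
}

// Serialize only the element, not the <html><body> wrapper loadHTML adds.
$result = $doc->saveHTML($element);
echo $result;
```

Note that saveHTML re-serializes the markup, so the remaining tag is written as <br> instead of <br/>.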
Search for this regex:
(<any_html_element>.*?)<br/>
and replace with:
$1
Turn on single-line mode if there may be line breaks between the two tags. In PHP you do that with the s modifier, placed after the closing delimiter (/.../s).
If by any_html_element you meant that any element should be allowed, use this regex:
(<\w[^<>]*>.*?)<br/>
The replacement text remains the same.
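Put together in PHP, the regex approach might look like the sketch below. The input string is made up for the example. The pattern is a slightly loosened variant of the one above (my assumption): [^<>]* so that single-letter tags such as <p> match, and <br\s*/?> so that <br>, <br/> and <br /> are all accepted. The ~ delimiter avoids having to escape the slash.

```php
<?php
// Hypothetical input with a line break between the two tags.
$html = "<div>  \n<br/>Some text<br/>More text</div>";

// s modifier: the dot also matches newlines (single-line mode).
$pattern = '~(<\w[^<>]*>.*?)<br\s*/?>~s';

// The fourth argument (limit = 1) stops after the first match,
// so only the first <br/> in the string is removed.
$result = preg_replace($pattern, '$1', $html, 1);
echo $result;
```

Without the limit argument, preg_replace keeps going and removes the first <br/> after each later opening tag as well.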
While it's true that you can't parse HTML with just one regex, Uffo isn't trying to parse HTML. He just wants to delete one tag. A regex will do that just fine.