I am trying to get a working regular expression to convert standard HTML code to a custom format (needed for data export).
For example, given the following code:
<a href="toto.php">Toto
</a> bwahaha
<td width="49%" bgcolor="#FF9E39" style="padding-left: 10px; padding-top: 3px; padding-bottom: 3px; border-bottom: 5px solid rgb(255, 255, 255);" class="texteblanc">
<a href="nuit-orientation.php" class="texteblanc">[strong]Nuit de l'orientation[/strong]</a>
</td>
I would like to convert the two links to the following format:
[a:toto.php]Toto[/a]
[a:nuit-orientation.php][strong]Nuit de l'orientation[/strong][/a]
And of course I want the links to be kept in place within the existing HTML code.
So I tried the following code:
$txt = preg_replace('/<a href="(([[:word:]]|[[:punct:]])+)"[^>]*>\n*(\r\n)*\r*(([[:print:]]|\r\n|\n)+)\n*(\r\n)*\r*<\/a>/i', '[a:${1}]${4}[/a]', $txt);
It works, but not all the time...
Does anyone have an idea how to do this reliably?
Thanks,
Damien
Don't use regex to parse HTML!
Use an HTML parser.
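For instance, here is a minimal sketch using PHP's built-in DOM extension (DOMDocument). The function name convertLinks and the wrapper <div> are my own choices for illustration, not anything from the question. It assumes the input is a UTF-8 HTML fragment and that trimming whitespace at the edges of the link text is acceptable. Each <a href="..."> element is replaced in place by [a:href] ... [/a], and the rest of the markup is left to the parser.

```php
<?php
// Hypothetical helper: replaces every <a href="..."> in an HTML fragment
// with [a:href]...[/a], leaving the surrounding markup in place.
function convertLinks(string $html): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The XML encoding hint makes libxml read the input as UTF-8; the wrapper
    // <div> (an assumption of this sketch) lets us extract the fragment afterwards.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list to an array before modifying the tree.
    $links = iterator_to_array($doc->getElementsByTagName('a'));

    foreach ($links as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // named anchors etc. are left untouched
        }

        // Trim whitespace at the edges of the link text ("Toto\n" becomes "Toto").
        if ($a->firstChild instanceof DOMText) {
            $a->firstChild->data = ltrim($a->firstChild->data);
        }
        if ($a->lastChild instanceof DOMText) {
            $a->lastChild->data = rtrim($a->lastChild->data);
        }

        // Build "[a:href]" + original children + "[/a]" as a fragment.
        $fragment = $doc->createDocumentFragment();
        $fragment->appendChild($doc->createTextNode('[a:' . $a->getAttribute('href') . ']'));
        while ($a->firstChild) {
            $fragment->appendChild($a->firstChild); // moves the child out of <a>
        }
        $fragment->appendChild($doc->createTextNode('[/a]'));

        $a->parentNode->replaceChild($fragment, $a);
    }

    // Serialize only the contents of the wrapper <div>, not the implied <html><body>.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage with the first link from the question, wrapped in a <p> for the example.
echo convertLinks("<p><a href=\"toto.php\">Toto\n</a> bwahaha</p>");
```

Note that the parser may normalize the surrounding HTML when it serializes it (attribute quoting, implied tags, a <td> outside a table), so check the output against your export format before relying on it.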