I have a string variable that contains a lot of HTML markup and I want to get the last <li> element from it.
I'm using something like:
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";
preg_match('#<li(.*?)>(.*)</li>#ims', $markup, $matches);
$lis = "<li ".$matches[1].">".$matches[2]."</li>";
$total = explode("</li>",$lis);
$num = count($total)-2;
echo $total[$num]."</li>";
This works and I get the last <li> element printed. But I can't understand why I have to subtract 2 from the count of the array $total. Normally I would subtract only 1, since indexing starts at 0. What am I missing?
Is there a better way of getting the last <li> element from the string?
HTML is not regular, and so can't be parsed with a regular expression. Use a proper HTML parser.
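As a rough illustration of the parser route, here is a minimal sketch using PHP's bundled DOM extension. It assumes ext-dom is enabled; the variable names $doc, $items and $last are only for illustration.

```php
<?php
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

// getElementsByTagName returns the elements in document order,
// so the last <li> is the item at index length - 1.
$items = $doc->getElementsByTagName('li');
$last  = $items->length > 0 ? $items->item($items->length - 1) : null;

if ($last !== null) {
    echo $last->getAttribute('id'), ': ', $last->textContent, "\n";
    // Serialize the element itself, tags included.
    echo $doc->saveHTML($last), "\n";
}
```

This keeps working when the markup has nested tags or attributes containing ">", which is where the string-based approaches tend to break.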
@OP, your requirement looks simple, so no need for parsers or regex.
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";
$s = explode("</li>",$markup,-1);
$t = explode(">",end($s));
print end($t);
output
$ php test.php
Three
If you already know jQuery, you could also take a look at phpQuery. It's a PHP library that lets you access DOM elements much as you would in jQuery.
From the PHP.net documentation:
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
$matches[0] is the complete match (not just the captured bits)
Your pattern has 2 capturing groups, so $matches holds three entries:
$matches[0]; // Contains the text matched by the full pattern
$matches[1]; // Contains the attributes of the LI start tag (.*?)
$matches[2]; // Contains the text between the LI tags (.*); greedy, so it runs up to the last </li>
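As for the count($total)-2 in the question: the rebuilt string ends with "</li>", and explode() returns an empty string as the final piece when the input ends with the delimiter. A small sketch of that, reusing the question's $total and a hand-written $lis equivalent to the one built there (the $parts name is only for illustration):

```php
<?php
// Roughly what $lis holds in the question after the greedy match.
$lis = "<li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li>";

$total = explode("</li>", $lis);

// Three real pieces plus one trailing empty string.
var_dump(count($total));   // int(4)
var_dump(end($total));     // string(0) ""

// One way to avoid the off-by-one: let preg_split drop empty pieces.
$parts = preg_split('#</li>#', $lis, -1, PREG_SPLIT_NO_EMPTY);
echo $parts[count($parts) - 1] . "</li>\n";
```

So index count($total)-1 is the empty string, and count($total)-2 is the last real item.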
Parsing (X)HTML strings with regular expressions is hard and can be full of unexpected problems. Parsing anything more than simple tagged strings is not possible because (X)HTML is not a regular language.
You could improve your regex by using (not tested):
#<li([^>]*)>(.+?)</li>#ims
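If you do stay with a regex, a sketch along these lines collects every item with preg_match_all and takes the last one. The \b after "li" is my addition so that tags such as <link> are not matched; it assumes the <li> elements are not nested.

```php
<?php
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

// Non-greedy body, so each <li>...</li> pair is matched separately.
$pattern = '#<li\b([^>]*)>(.*?)</li>#is';

$count = preg_match_all($pattern, $markup, $matches);

if ($count > 0) {
    // $matches[0] holds the full matches, $matches[2] the inner text.
    $lastTag  = $matches[0][$count - 1];
    $lastText = $matches[2][$count - 1];
    echo $lastTag, "\n";
    echo $lastText, "\n";
}
```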
strrpos — Find position of last occurrence of a char in a string
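A minimal sketch of how strrpos could be applied here, assuming the markup is well formed and the items are not nested ($start, $end and $last are illustrative names):

```php
<?php
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

$close = '</li>';

// Position of the last opening "<li" and of the last closing "</li>".
$start = strrpos($markup, '<li');
$end   = strrpos($markup, $close);

$last = null;
if ($start !== false && $end !== false && $end > $start) {
    // Cut from the opening tag through the end of the closing tag.
    $last = substr($markup, $start, $end + strlen($close) - $start);
}

echo $last, "\n";
```

Note that '<li' would also match a tag like <link>, so this only suits markup you control.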