Count all HTML tags in page PHP

开发者 https://www.devze.com 2023-01-06 00:53 Source: Internet
I spent some time trying to solve this with regex but got no result. I am using PHP 5.3. I need information such as how many times each tag is repeated in the page, and information about all the tags in the page.


Your question is unfortunately barely understandable in its current form. Please try to update it and be more specific. If you want to count all HTML tags in a page, you can do the following:

$HTML = <<< HTML
<html>
    <head>
        <title>Some Text</title>
    </head>
    <body>
        <p>Hello World<br/>
            <img src="earth.jpg" alt="picture of earth from space"/>
        </p>
        <p>Counting Elements is easy with DOM</p>
    </body>
</html>
HTML;

Counting all DOMElements with DOM:

$dom = new DOMDocument;
$dom->loadHTML($HTML);
$allElements = $dom->getElementsByTagName('*');
echo $allElements->length;

The above will output 8, because there are eight elements in the DOM. If you also need to know the distribution of elements, you can do:

$elementDistribution = array();
foreach($allElements as $element) {
    if(array_key_exists($element->tagName, $elementDistribution)) {
        $elementDistribution[$element->tagName] += 1;
    } else {
        $elementDistribution[$element->tagName] = 1;
    }
}
print_r($elementDistribution);

This would output:

Array (
    [html] => 1
    [head] => 1
    [title] => 1
    [body] => 1
    [p] => 2
    [br] => 1
    [img] => 1
)
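For pages read from disk or the network, the markup is often invalid and loadHTML() will emit warnings. A minimal sketch of the same DOM approach wrapped in a function follows; the name countHtmlTags() and the sample input string are hypothetical and not part of the original answer, and libxml_use_internal_errors() is used on the assumption that you would rather collect parser warnings than print them.

```php
<?php
// Hypothetical helper (not from the original answer): counts element
// names in an HTML string using DOM, like the foreach loop above.
function countHtmlTags($html)
{
    // Real-world markup is often invalid; keep libxml warnings
    // internal instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Collect the tag name of every element in the document.
    $names = array();
    foreach ($dom->getElementsByTagName('*') as $element) {
        $names[] = $element->tagName;
    }

    // array_count_values() builds the name => count map in one call.
    $counts = array_count_values($names);
    arsort($counts); // most frequent tags first
    return $counts;
}

// Sample input is an assumption for illustration only.
$counts = countHtmlTags('<html><body><p>One</p><p>Two<br></p></body></html>');
print_r($counts);
```

array_sum($counts) then gives the total number of elements, which should match the length reported by getElementsByTagName('*').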

Note that getElementsByTagName returns DOMElements only. It does not take into account closing tags, nor does it return other DOMNodes. If you also need to count closing tags and other node types, consider using XMLReader instead.
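A rough sketch of the XMLReader idea, assuming the input is well-formed XML/XHTML (XMLReader is an XML parser and will fail on typical tag-soup HTML). The variable names and the sample string are made up for illustration. Note that a self-closing element such as <br/> is reported as an element without a separate end tag.

```php
<?php
// Assumption: well-formed XHTML input; this sample string is illustrative only.
$xhtml = '<html><head><title>Some Text</title></head>'
       . '<body><p>Hello World<br/></p><p>Second</p></body></html>';

$reader = new XMLReader;
$reader->XML($xhtml);

$opening = array(); // element name => number of start tags (including <br/>)
$closing = array(); // element name => number of explicit end tags
$otherNodes = 0;    // text, whitespace, comments and any other node type

// read() advances to the next node in document order.
while ($reader->read()) {
    $name = $reader->name;
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $opening[$name] = isset($opening[$name]) ? $opening[$name] + 1 : 1;
    } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
        $closing[$name] = isset($closing[$name]) ? $closing[$name] + 1 : 1;
    } else {
        $otherNodes++;
    }
}
$reader->close();

print_r($opening);
print_r($closing);
```

Because XMLReader streams through the document node by node, it also keeps memory use low on large files, at the cost of requiring valid XML.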


$testHTML = file_get_contents('index.html');

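// Capture the name of every opening tag; closing tags, comments and doctypes are skipped.
// The name must start with a letter, so "<?php" or "< 5" is not counted as a tag.
$search = preg_match_all('/<([a-z][a-z0-9]*)/i', $testHTML, $matches);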

echo '<pre>';
var_dump($matches[1]);
echo '</pre>';

This gives you an array of all the tag names. Once the data is in the array, you can use the standard PHP array functions, e.g. array_count_values(), to extract the details you want, though you have not really said what information you want about the HTML tags.

Using array_count_values() with the results of the preg_match_all():

echo '<pre>';
var_dump(array_count_values($matches[1]));
echo '</pre>';

gives

array(5) {
  ["html"]=>
  int(1)
  ["head"]=>
  int(1)
  ["title"]=>
  int(1)
  ["body"]=>
  int(1)
  ["h1"]=>
  int(2)
}

Is this what you want?


I suggest you check out Simple HTML DOM:

http://simplehtmldom.sourceforge.net/manual.htm
