PHP Regex for information between h4 tags

开发者 https://www.devze.com 2023-03-24 05:24 Source: web
I am trying to grab the text inside an h4 tag:

    $regex = '/<h4>([A-Za-z0-9\,\.])/';

I am only getting the first letter back, and I cannot figure out how to use * to keep grabbing everything up to the first < character.

I have made countless attempts and know I am overlooking something simple.

So I was making that much harder than it needed to be. The following works:

    $regex = '/<h4>.*?<\/h4>/';
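
For what it's worth, the original pattern likely returned a single letter because a character class matches exactly one character unless a quantifier follows it. A small sketch of the difference; the sample string and variable names are made up for illustration:

```php
<?php
$html = '<h4>Hello, World.</h4>';

// No quantifier: the class matches exactly one character.
preg_match('/<h4>([A-Za-z0-9,.])/', $html, $one);

// "+" repeats the class, but it stops at the space because a space is not in the class.
preg_match('/<h4>([A-Za-z0-9,.]+)/', $html, $some);

// Adding a space to the class lets the match run up to the first "<".
preg_match('/<h4>([A-Za-z0-9,. ]+)/', $html, $all);

echo $one[1], "\n";   // H
echo $some[1], "\n";  // Hello,
echo $all[1], "\n";   // Hello, World.
```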


If you can trust that grabbing all characters up to the first < is a good enough rule then use this:

    $regex = '/<h4>([^<]*?)</';

Of course, that pattern will only grab 'The ' from <h4>The <b>Best</b> Book</h4>. You can fix that by changing it to:

    $regex = '/<h4>(.*?)<\/h4>/';

This will grab everything between an <h4> and a </h4>, but it still isn't perfect: anything like <h4 > or <h4 style="..."> will break it, along with a million other valid HTML variations. If you know that the contents won't contain any <, though, and that your tag will always be exactly <h4>, the first pattern works well enough for your situation.
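
To make the trade-offs concrete, here is a rough sketch that runs these patterns against sample input. The sample strings and variable names are invented for illustration, and the third pattern (which tolerates attributes) is one possible workaround, not a complete fix:

```php
<?php
$html = '<h4>The <b>Best</b> Book</h4>';

// Stops at the first "<", so nested tags cut the match short.
preg_match('/<h4>([^<]*)</', $html, $upToFirstTag);

// Lazily matches up to the closing </h4>, nested markup included.
preg_match('/<h4>(.*?)<\/h4>/', $html, $upToClosingTag);

// Assumption: allowing [^>]* after the tag name tolerates attributes and stray spaces.
// The "i" flag ignores case and "s" lets "." match newlines.
$styled = '<h4 style="color: red">Styled</h4>';
preg_match('/<h4\b[^>]*>(.*?)<\/h4>/is', $styled, $withAttributes);

echo $upToFirstTag[1], "\n";    // The 
echo $upToClosingTag[1], "\n";  // The <b>Best</b> Book
echo $withAttributes[1], "\n";  // Styled
```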

If your situation is more complex, you will want to use something like PHP's DOM extension (DOMDocument), which is meant for parsing HTML and XML. Neither is a regular language, so neither can be parsed reliably with regex.
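
A minimal sketch of the DOM approach, assuming the DOM extension is enabled (it usually is by default). The sample markup and the `$headings` variable are made up for illustration:

```php
<?php
$html = '<div><h4 class="title">The <b>Best</b> Book</h4><h4>Second</h4></div>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of emitting them for imperfect markup.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$headings = [];
foreach ($doc->getElementsByTagName('h4') as $node) {
    // textContent is the text of the element and its descendants, without the tags.
    $headings[] = $node->textContent;
}

print_r($headings); // "The Best Book" and "Second"
```

Because the parser handles attributes and nested tags itself, the h4 with a class attribute and an inner <b> needs no special treatment here.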


You can use the below function to accomplish this task.

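    function getTextBetweenTags($string, $tagname) {
        // [^>]* allows optional attributes; (.*?) is lazy, so the match stops at the first closing tag.
        $pattern = "/<$tagname\b[^>]*>(.*?)<\/$tagname>/";
        preg_match($pattern, $string, $matches);
        // $matches[0] is the whole element; $matches[1] is the text between the tags.
        return $matches;
    }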

Pass the complete string as the first parameter and the tag name (for example, "h4") as the second.
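
If every occurrence is wanted rather than only the first, a variant along these lines may help. The name `getAllTextBetweenTags` and the sample input are hypothetical. This is a sketch under the same caveat as any regex-based HTML matching, not a robust parser:

```php
<?php
// Hypothetical helper: collects the inner text of every matching tag.
function getAllTextBetweenTags(string $string, string $tagname): array
{
    // preg_quote() escapes regex metacharacters in the tag name; "/" is the delimiter.
    $tag = preg_quote($tagname, '/');
    // [^>]* allows attributes; (.*?) is lazy so each match ends at the nearest closing tag.
    $pattern = "/<{$tag}\b[^>]*>(.*?)<\/{$tag}>/is";
    preg_match_all($pattern, $string, $matches);
    // Index 1 holds the first capture group of every match.
    return $matches[1];
}

$html = '<h4>First</h4><p>text</p><h4 class="x">Second</h4>';
print_r(getAllTextBetweenTags($html, 'h4')); // "First" and "Second"
```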
