PHP scrape title, IF

Source: https://www.devze.com (reposted from the web), 2023-02-18 08:32
Basically I am wanting to get the title of pages,

I want it to return TRUE if title is like:

<title>
Site Name - Page</title>

but return false if title is like:

<title>
Site Name - </title>

How can I pass a URL to fopen, check the title, and then return TRUE/FALSE depending on it? We only want TRUE if there is text after the "-" in the title tag.

Here is the code I am currently working with:

while ($r = mysql_fetch_array($q)){
    $url = "http://www.sitename/" . strtolower($r['z'] . "." . $r['x']) . "/";
    $file = fopen(($url),"r") or die ("Can't read input stream");
    $text = fread($file,32768);
    if (preg_match('/<title>(.*?)<\/title>/is',$text,$found)) {
            $title = 1;
    } else {
            $title = 0;
    }
    fclose($file);
}


I haven't verified your code for opening the URL, but I do see that your regex could be improved upon. Try this...

/<title>.+\s-\s.+<\/title>/is

where

.+ ensures there is at least one character before and after the dash, and
\s-\s ensures that there is a " - " separating the first and second part of the title tag.
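
As a quick check, the pattern above can be run against the two sample titles from the question. This is only a sketch: the variable names are made up for illustration, and the sample strings assume the line break sits directly after the opening tag, as shown in the question.

```php
<?php
// Pattern from the answer above: some text, " - ", then at least one more character.
// The "s" flag lets "." match the line break after <title>.
$pattern = '/<title>.+\s-\s.+<\/title>/is';

// The two sample titles from the question.
$withPage    = "<title>\nSite Name - Page</title>";
$withoutPage = "<title>\nSite Name - </title>";

$hasPage   = preg_match($pattern, $withPage) === 1;
$hasNoPage = preg_match($pattern, $withoutPage) === 1;

var_dump($hasPage);   // bool(true)
var_dump($hasNoPage); // bool(false)
```

Note that a title with extra whitespace after the dash (for example two trailing spaces) would still match, because the final .+ accepts whitespace.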


I would wrap the title check in a function like this:

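function check_title($url){
  $html = file_get_contents($url);
  if ($html === false) {
    return FALSE; // page could not be fetched
  }
  // [^<] also matches line breaks, so a <title> split across lines still matches;
  // [^\s<] requires a non-whitespace character somewhere after the dash.
  return preg_match('/<title>[^<]*-\s*[^\s<][^<]*<\/title>/i', $html) === 1;
}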

and you could use it like this:

while ($r = mysql_fetch_array($q)){
  $url = "http://www.sitename/" . strtolower($r['z'] . "." . $r['x']) . "/";
  $title = check_title($url);
}
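
If it helps to test the title logic without hitting the network, the check can be split from the fetch. The following is a minimal sketch, not part of the original answers: title_has_page_part is a hypothetical helper name, it assumes PHP 7 or later for the type declarations, and it assumes that "text after the dash" means text after the last "-" in the title.

```php
<?php
// Hypothetical helper: returns true only when the <title> contains
// non-whitespace text after its last "-".
function title_has_page_part(string $html): bool
{
    // Capture the title text; the "s" flag lets "." span line breaks.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $found) !== 1) {
        return false; // no title tag at all
    }
    $title = trim($found[1]);
    $dash = strrpos($title, '-');
    if ($dash === false) {
        return false; // no dash separator in the title
    }
    // Everything after the last dash, with surrounding whitespace removed.
    return trim(substr($title, $dash + 1)) !== '';
}

var_dump(title_has_page_part("<title>\nSite Name - Page</title>")); // bool(true)
var_dump(title_has_page_part("<title>\nSite Name - </title>"));     // bool(false)
```

Inside the loop, the HTML returned by file_get_contents($url) could be passed to this helper once the fetch has been checked for failure.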