php preg_replace, regexp

Developer https://www.devze.com 2022-12-30 04:50 Source: Internet
I'm trying to extract the postal codes from yell.com using PHP and preg_replace. I successfully extracted the postal code, but only along with the address. Here is an example:

$URL = "http://www.yell.com/ucs/UcsSearchAction.do?scrambleSeed=17824062&keywords=shop&layout=&companyName=&location=London&searchType=advance&broaderLocation=&clarifyIndex=0&clarifyOptions=CLOTHES+SHOPS|CLOTHES+SHOPS+-+LADIES|&ooa=&M=&ssm=1&lCOption32=RES|CLOTHES+SHOPS+-+LADIES&bandedclarifyResults=1";

//get yell.com page in a string
$htmlContent = $baseClass->getContent($URL);

//get postal code along with the address
//(the "/" inside the pattern is escaped so it does not end the regex delimiter)
$result2 = preg_match_all("/(.*)<\/span>/", $htmlContent, $matches);

print_r($matches);

The above code outputs something like:

Array ( [0] => Array ( [0] => 7, Royal Parade, Chislehurst, Kent BR7 6NR [1] => 55, Monmouth St, London, WC2H 9DG ....

The problem is that I don't know how to extract only the postal code without the address, because it doesn't have a fixed length (sometimes it has 6 characters and sometimes only 5). Basically, I need to extract the last 2 words from each array item. Thank you in advance for any help!


quick & dirty:

# your array item
$string = "7, Royal Parade, Chislehurst, Kent BR7 6NR";

# split on spaces
$bits = preg_split('/\s/', $string);

# last two bits
end($bits);
$postcode = prev($bits) . " " . end($bits);

echo $postcode;

See it run at: codepad
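
To apply the same idea to every address in the result array, something like the sketch below might work. The helper name lastTwoWords and the $addresses sample array are illustrative assumptions, not part of the original code. It splits on runs of whitespace rather than single characters, so double spaces do not produce empty pieces.

```php
<?php
// Hypothetical helper: returns the last two whitespace-separated words of a string.
function lastTwoWords(string $address): string
{
    // Split on runs of whitespace and drop any empty pieces.
    $bits = preg_split('/\s+/', trim($address), -1, PREG_SPLIT_NO_EMPTY);

    // Take the final two pieces (fewer if the string is shorter) and rejoin them.
    return implode(' ', array_slice($bits, -2));
}

// Sample data copied from the output shown in the question.
$addresses = [
    '7, Royal Parade, Chislehurst, Kent BR7 6NR',
    '55, Monmouth St, London, WC2H 9DG',
];

$postcodes = array_map('lastTwoWords', $addresses);
print_r($postcodes);
```

Using array_slice with a negative offset avoids moving the array's internal pointer with end() and prev(), which some readers find easier to follow.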


If you just need to match the last two words in a string, you can use this regex:

\b\w+\s+\w+$

This matches exactly what it says: a word boundary, one or more word characters, one or more whitespace characters, then another run of word characters, followed by the end-of-string anchor.

<?php

$text = "7, Royal Parade, Chislehurst, Kent BR7 6NR";
$result = preg_match('/\b\w+\s+\w+$/', $text, $matches);
print_r($matches);

?>

This prints:

Array
(
    [0] => BR7 6NR
)

You can also make the regex more robust, for example by allowing optional trailing whitespace after the last word with \s*, but anchoring to the end of the string with $ is the main idea.
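
As a rough sketch of that more tolerant variant, the pattern below adds \s* before the anchor and wraps the two words in a capture group so the trailing whitespace is not part of the result. The variable names and the sample strings with trailing whitespace are assumptions made for illustration.

```php
<?php
// Two words at the end of the string, tolerating trailing whitespace.
// Group 1 captures the two words without the trailing whitespace.
$pattern = '/\b(\w+\s+\w+)\s*$/';

// Sample addresses from the question, with trailing whitespace added for the demo.
$addresses = [
    "7, Royal Parade, Chislehurst, Kent BR7 6NR  ",
    "55, Monmouth St, London, WC2H 9DG\n",
];

$postcodes = [];
foreach ($addresses as $address) {
    // Keep only the addresses where the pattern actually matched.
    if (preg_match($pattern, $address, $m)) {
        $postcodes[] = $m[1];
    }
}

print_r($postcodes);
```

Note that this still only picks the last two words. It does not check that they actually look like a postal code.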
