I'm trying to extract the users who ask questions on a classified ad website (http://trademe.co.nz/Trade-Me-Motors/Cars/Toyota/Hiace/auction-300294634.htm). For some reason the pattern I'm using does not always work, so I would appreciate your help with a perfect regex. Here is my current code:
//get memberid of the question asker
$pattern = "//m";
preg_match_all($pattern, $htmlContent, $member_match);
$no_a = count($member_match[1]);
$inc = 0;
echo "number of askers is $no_a";
//make loop to get all the members
while($inc ";
//get member user match based on the member_id
$pattern2 = "/(.*)/";
preg_match_all($pattern2, $htmlContent, $member_user_match);
$bid_user_q = $member_user_match[1][0];
//store the askers
mysql_query("INSERT INTO askers (id, item_number, bid_user_q, bid_member_id_q, sub_cat) VALUES('', '$item_number', '$bid_user_q', '$bid_member_id_q', '$sub_cat')");
echo "INSERT INTO askers (id, item_number, bid_user_q, bid_member_id_q) VALUES('', '$item_number', '$bid_user_q', '$bid_member_id_q', '$sub_cat')";
mysql_error();
$inc++;
}
The code isn't displayed properly because of the HTML tags in the patterns, so you can see it here: http://pastebin.com/iPxizy5X
I doubt it is "perfect", but this one worked for me:
/<small>\s*<a href=\"\/Members\/Listings\.aspx\?member=(\d+)\">\s*<b>(.*?)<\/b>/
If you use:
$pattern = "/<small>\s*<a href=\"\/Members\/Listings\.aspx\?member=(\d+)\">\s*<b>(.*?)<\/b>/";
preg_match_all($pattern, $htmlContent, $member_match, PREG_SET_ORDER);
then:
$member_match[0][1] = member id
$member_match[0][2] = member nick
$member_match[1][1] = member id
$member_match[1][2] = member nick
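As a rough, self-contained sketch of how the pattern above could be used in a loop: the sample markup in $htmlContent is hypothetical (modeled on what the pattern expects, the real page may differ), and the $askers array and its keys are names made up for illustration.

```php
<?php
// Hypothetical sample markup, shaped like what the pattern expects.
// The real page source may look different.
$htmlContent = <<<'HTML'
<small>
  <a href="/Members/Listings.aspx?member=1234567"><b>alice_nz</b></a>
</small>
<small><a href="/Members/Listings.aspx?member=7654321">
  <b>bob42</b></a></small>
HTML;

// Same pattern as above, written in single quotes so the double quotes
// inside it need no escaping.
$pattern = '/<small>\s*<a href="\/Members\/Listings\.aspx\?member=(\d+)">\s*<b>(.*?)<\/b>/';

// $askers is an illustrative container, not part of the original code.
$askers = [];
if (preg_match_all($pattern, $htmlContent, $member_match, PREG_SET_ORDER)) {
    // With PREG_SET_ORDER each element is one complete match:
    // [0] = full match, [1] = member id, [2] = member nick.
    foreach ($member_match as $match) {
        $askers[] = ['member_id' => $match[1], 'nick' => $match[2]];
    }
}

foreach ($askers as $asker) {
    echo $asker['member_id'], ' => ', $asker['nick'], "\n";
}
```

Because each match carries both the id and the nick, a second preg_match_all per member is probably unnecessary; the foreach can feed both values straight into whatever stores the askers.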