Trouble getting a list of cities from LS

Developer https://www.devze.com 2023-04-07 04:03 Source: Internet
I'm struggling to get an array of LS cities. file_get_contents() returns the page, but the dropdown on their roadblock (the one requiring you to select a city) is empty. I then thought the list was coming from an AJAX request, but looking at the page I don't see any AJAX requests. Then I tried cURL, thinking that simulating a browser might help, but the code below had no effect.

$ch = curl_init("http://www.URL.com/");
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)');
$result=curl_exec($ch);
var_dump($result);

Does anyone have any ideas on how I can get a solid list of available areas?


I found out how they populate the list of cities and wrote some sample code, below, that you can use.

The list of cities is stored as a JSON string in one of their JavaScript files, and the dropdown is actually populated from a different JavaScript file. The names of the files appear to be somewhat random, but the root name remains the same.

An example of the JS file with the city JSON is hXXp://a3.ak.lscdn.net/deals/system/javascripts/bingy-81bf24c3431bcffd317457ce1n434ca9.js. The script that populates the list is hXXp://a2.ak.lscdn.net/deals/system/javascripts/confirm_city-81bf24c3431bcffd317457ce1n434ca9.js, but for our purposes that one is inconsequential.

We need to load their home page with a new curl session, look for the unique javascript URL that is the bingy script and fetch that with curl. Then we need to find the JSON and decode it to PHP so we can use it.

Here is the script I came up with that works for me:

<?php

error_reporting(E_ALL); ini_set('display_errors', 1);  // debugging

// set up new curl session with options
$ch = curl_init('http://livingsocial.com');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13');

$res = curl_exec($ch); // fetch home page

if ($res === false) {
    die('Failed to fetch home page: ' . curl_error($ch));
}

// regex string to find the bingy javascript file
$matchStr = '/src="(https?:\/\/.*?(?:javascripts)\/bingy-?[^\.]*\.js)"/i';

if (!preg_match($matchStr, $res, $bingyMatch)) {
    die('Failed to extract URL of javascript file!');
}

// this js file is now our new url
$url = $bingyMatch[1];

curl_setopt($ch, CURLOPT_URL, $url);

$res = curl_exec($ch); // fetch bingy js

if ($res === false) {
    die('Failed to fetch javascript file: ' . curl_error($ch));
}

$pos = strpos($res, 'fte_cities'); // search for the fte_cities variable where the list is stored

if ($pos === false) {
    die('Failed to locate cities JSON in javascript file!');
}

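// find the beginning of the json string, and the end of the line it sits on
$startPos = strpos($res, '{', $pos + 1);

if ($startPos === false) {
    die('Failed to locate start of cities JSON!');
}

$endPos = strpos($res, "\n", $startPos);

if ($endPos === false) {
    $endPos = strlen($res); // no newline after the json, so read to end of file
}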

$json = trim(substr($res, $startPos, $endPos - $startPos)); // snip out the json

if (substr($json, -1) == ';') $json = substr($json, 0, -1); // remove trailing semicolon if present

$places = json_decode($json, true); // decode json to php array

if (!is_array($places)) {
    die('Failed to decode JSON string of cities!');
}

// array is structured where each country is a key, and the value is an array of cities
foreach($places as $country => $cities) {
    echo "Country: $country<br />\n";

    foreach($cities as $city) {
        echo '  '
            ."{$city['name']} - {$city['id']}<br />\n";
    }

    echo "<br />\n";
}

Some important notes:

If they decide to change the JavaScript file names, this will fail to work. If they rename the variable that holds the cities, this will fail to work. If they modify the JSON to span multiple lines, this will not work (this is unlikely because it uses extra bandwidth). If they change the structure of the JSON object, this will not work.

In any case, depending on their modifications it may be trivial to get working again, but it is a potential issue. They may also be unlikely to make these logistical changes because it would require modifications to a number of files, and then require more testing.
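One of the failure modes above, the JSON spanning multiple lines, can be sidestepped by matching braces instead of reading to the end of the line. The following is only a rough sketch under assumptions: extractJsonObject() is a hypothetical helper that is not part of the script above, and the sample input is made up rather than taken from their actual file.

```php
<?php

/**
 * Return the first balanced {...} object that follows $marker in $text,
 * or null if none is found. Hypothetical helper for illustration only.
 */
function extractJsonObject(string $text, string $marker): ?string
{
    $pos = strpos($text, $marker);
    if ($pos === false) {
        return null;
    }

    $start = strpos($text, '{', $pos);
    if ($start === false) {
        return null;
    }

    $depth    = 0;
    $inString = false;
    $length   = strlen($text);

    for ($i = $start; $i < $length; $i++) {
        $char = $text[$i];

        if ($inString) {
            if ($char === '\\') {
                $i++;                 // skip the escaped character
            } elseif ($char === '"') {
                $inString = false;    // closing quote
            }
            continue;                 // braces inside strings do not count
        }

        if ($char === '"') {
            $inString = true;
        } elseif ($char === '{') {
            $depth++;
        } elseif ($char === '}') {
            $depth--;
            if ($depth === 0) {
                return substr($text, $start, $i - $start + 1);
            }
        }
    }

    return null; // braces never balanced
}

// Made-up sample input; the real file's contents may differ.
$js = <<<'JS'
var fte_cities = {
  "US": [{"id": 1, "name": "A {test} city"}]
};
var other = 1;
JS;

$json   = extractJsonObject($js, 'fte_cities');
$places = ($json === null) ? null : json_decode($json, true);

if (is_array($places)) {
    echo $places['US'][0]['name'], "\n";
}
```

In the script above, this would stand in for the strpos()/substr() snipping step. It still breaks if they rename the variable or change the structure of the JSON.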

Hope that helps!


Perhaps a bit late, but you don't need to couple to our JavaScript to obtain the cities list. We have an API for that:

https://sites.google.com/a/hungrymachine.com/livingsocial-api/home/cities
