PHP XML Cannot parse the CDATA

Developer https://www.devze.com 2023-04-04 02:38 Source: web
I am parsing an RSS feed from the National Data Buoy Center in PHP. I am not able to parse the description, which is wrapped in CDATA. The end goal is to have the description items as variables, such as Location, Wind Direction, Wind Speed, etc. I am unsure how to break these out and omit the tags.

Here is a snippet of the feed:

<item>
  <pubDate>Thu, 08 Sep 2011 17:59:39 UT</pubDate>
  <title>Station SFXC1 - SAN FRANCISCO BAY RESERVE, CA</title>
  <description><![CDATA[
    <strong>September 8, 2011 9:45 am PDT</strong><br />
    <strong>Location:</strong> 38.223N 122.026W or 77 nautical miles S of search location of 39.5N 122.1W.<br />
    <strong>Wind Direction:</strong> W (270&#176;)<br />
    <strong>Wind Speed:</strong> 11 knots<br />
    <strong>Atmospheric Pressure:</strong> 30.03 in (1017.0 mb)<br />
    <strong>Air Temperature:</strong> 62&#176;F (16.9&#176;C)<br />
    <strong>Dew Point:</strong> 50&#176;F (10.2&#176;C)<br />
  ]]></description>

  <link>http://www.ndbc.noaa.gov/station_page.php?station=sfxc1</link>
  <guid>http://www.ndbc.noaa.gov/station_page.php?station=sfxc1&amp;ts=1315500300</guid>
  <georss:point>38.223 -122.026</georss:point>
</item>

Here is the PHP:

$feed_url = "http://www.ndbc.noaa.gov/rss/ndbc_obs_search.php?lat=39.5&lon=-122.1&radius=400";
$xmlString = file_get_contents($feed_url);
$xmlString = str_replace('georss:point', 'point', $xmlString);
$xml = new SimpleXMLElement($xmlString);
$items = $xml->xpath('channel/item');
$closeItems = array();
$new_array = array();
foreach ($items as &$item)
{
    echo "<br>";
    $item_title = $item->title;
    $item_title = mb_convert_case($item_title, MB_CASE_UPPER, "UTF-8");
    list($lat, $lng) = explode(' ', trim($item->point));
    echo $item_title;
    echo "<br>";
    echo $lat;
    echo "<br>";
    echo $lng;
    echo "<br>";
    echo $item->description;
    echo "<br>";
    echo $item->pubDate;
    echo "<br>";
}


In situations like this, don't just echo the expected value and give up when it's empty and come crying to SO (just kidding about the crying part).

Use PHP's var_dump or print_r to see what you're really getting. Is it NULL? Is it the empty string? Is it some other SimpleXMLElement object you need to descend into?

Not only will this make your question more informative and likely to be answered, but you'll probably end up solving the problem yourself (and then posting an answer here for other people who stumble upon it in the future).
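As a rough illustration of that inspection step, here is a minimal sketch. The one-item feed is made up to resemble the snippet in the question, so it runs without a network call. Note that dumping a SimpleXMLElement may show very little for a node whose only content is CDATA; casting the node to a string is what returns the text.

```php
<?php
// Made-up one-item feed, shaped like the snippet in the question.
$xmlString = <<<'XML'
<rss version="2.0"><channel><item>
  <title>Station SFXC1 - SAN FRANCISCO BAY RESERVE, CA</title>
  <description><![CDATA[
    <strong>Wind Speed:</strong> 11 knots<br />
  ]]></description>
</item></channel></rss>
XML;

$xml  = new SimpleXMLElement($xmlString);
$item = $xml->channel->item;

// Dump the node itself to see what type you are really holding.
var_dump($item->description);

// Cast to string to get the node's text content, CDATA included.
$description = (string) $item->description;
var_dump($description);
```

If the second dump shows the HTML fragment, the feed was parsed fine and what remains is cleaning up the markup inside the description.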


Rewrote my solution to actually be correct:

$feed_url = "http://www.ndbc.noaa.gov/rss/ndbc_obs_search.php?lat=39.5&lon=-122.1&radius=400";
$xmlString = file_get_contents($feed_url);
// Rename georss:point so SimpleXML can read it without namespace handling.
$xmlString = str_replace('georss:point', 'point', $xmlString);
$xml = new SimpleXMLElement($xmlString);
$items = $xml->xpath('channel/item');
foreach ($items as $item) {
    $item_title = mb_convert_case((string) $item->title, MB_CASE_UPPER, "UTF-8");

    // Cast the CDATA node to a string, strip the HTML tags, decode entities
    // such as &#176;, then drop the indentation left over from the feed.
    $description = html_entity_decode(strip_tags((string) $item->description), ENT_QUOTES, "UTF-8");
    $description = mb_convert_case(str_replace('        ', '', trim($description)), MB_CASE_UPPER, "UTF-8");

    list($lat, $lng) = explode(' ', trim((string) $item->point));

    echo $item_title . PHP_EOL . $lat . ' x ' . $lng . PHP_EOL . 'published: ' . $item->pubDate . PHP_EOL . 'Description: ' . PHP_EOL . $description . PHP_EOL . PHP_EOL;
}

I took the CDATA, removed the tags, decoded the HTML entities, and removed the pesky whitespace. A regex might be better for removing the whitespace.
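For the original goal of getting Location, Wind Direction, and so on as separate values, one possible approach is sketched here. It assumes every reading keeps the `<strong>Label:</strong> value<br />` shape shown in the question. The function name `parse_description` is hypothetical, and the whitespace cleanup uses a regex in place of the fixed-width `str_replace`.

```php
<?php
// Hypothetical helper, not part of the original answer: turns the description
// HTML into an array keyed by label, e.g. 'Wind Speed' => '11 knots'.
function parse_description(string $html): array
{
    $fields = [];
    // Match "<strong>Label:</strong> value<br />". The timestamp line has no
    // colon directly before </strong>, so it is skipped.
    preg_match_all('~<strong>([^<:]+):</strong>(.*?)<br\s*/?>~s', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Decode entities such as &#176; and drop any stray tags in the value.
        $value = html_entity_decode(strip_tags($m[2]), ENT_QUOTES, 'UTF-8');
        // Collapse any run of whitespace (spaces, tabs, newlines) to one space.
        $fields[trim($m[1])] = trim(preg_replace('/\s+/u', ' ', $value));
    }
    return $fields;
}

// Sample input copied from the feed snippet in the question.
$sample = <<<'HTML'
    <strong>September 8, 2011 9:45 am PDT</strong><br />
    <strong>Location:</strong> 38.223N 122.026W or 77 nautical miles S of search location of 39.5N 122.1W.<br />
    <strong>Wind Direction:</strong> W (270&#176;)<br />
    <strong>Wind Speed:</strong> 11 knots<br />
HTML;

$fields = parse_description($sample);
echo $fields['Wind Direction'] . PHP_EOL;
echo $fields['Wind Speed'] . PHP_EOL;
```

Inside the loop this would be called as `parse_description((string) $item->description)`. If the feed ever changes its markup, the pattern will simply match nothing, so it is worth checking for missing keys before using them.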
