Retrieve XML from a third party page in PHP

Developer https://www.devze.com 2023-01-04 00:40 Source: Internet
I need to read in and parse data from a third-party website that sends XML data. All of this needs to be done server side.

What is the best way to do this using PHP?


You can obtain the remote XML data with, e.g.

$xmldata = file_get_contents("http://www.example.com/xmldata");

or with curl. Then use SimpleXML, DOM, whatever.
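
For the curl route, here is a minimal sketch, assuming the curl and SimpleXML extensions are available. The helper names fetch_xml and parse_xml are made up for illustration, and the URL in the usage comment is just the placeholder from above. It downloads the body as a string, then parses it and turns libxml parse errors into an exception instead of PHP warnings.

```php
<?php
// Hypothetical helper: download a URL with the curl extension and return the body.
function fetch_xml(string $url, int $timeout = 10): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,      // follow HTTP redirects
        CURLOPT_TIMEOUT        => $timeout,  // give up after this many seconds
        CURLOPT_FAILONERROR    => true,      // treat HTTP status >= 400 as a failure
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Download failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical helper: parse an XML string, reporting bad input as an exception.
function parse_xml(string $xmldata): SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($xmldata);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);   // restore the previous setting

    if ($xml === false) {
        $message = $errors ? trim($errors[0]->message) : 'unknown error';
        throw new RuntimeException('Invalid XML: ' . $message);
    }
    return $xml;
}

// Usage (placeholder URL):
// $xml = parse_xml(fetch_xml("http://www.example.com/xmldata"));
```

Keeping the download and the parsing in separate functions means the parsing part can be tried on a literal string without touching the network.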


A good way of parsing XML is often to use an XPP (XML Pull Parsing) library. PHP has an implementation of it called XMLReader.

http://php.net/manual/en/class.xmlreader.php
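
A rough sketch of what pull parsing with XMLReader can look like, assuming an RSS-like document whose entries are <item> elements with a <title> child. The function name read_item_titles is hypothetical. The reader walks the document node by node, and only the current <item> is expanded into a SimpleXMLElement, so the whole document is never built as a tree.

```php
<?php
// Hypothetical helper: stream through an XML string and collect the <title>
// of every <item> element.
function read_item_titles(string $xmldata): array
{
    $titles = [];
    $reader = new XMLReader();
    $reader->XML($xmldata);   // for a remote document, $reader->open($url) can be used instead

    // read() advances to the next node and returns false at the end of the input.
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
            // Expand only the current <item> subtree into a small SimpleXML object.
            $item = new SimpleXMLElement($reader->readOuterXml());
            $titles[] = (string) $item->title;
        }
    }
    $reader->close();
    return $titles;
}
```

This is mostly worth the extra code for large feeds. For small documents SimpleXML or DOM is usually simpler.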


I would suggest using DOMDocument (a class built into PHP). A simple example of its power is the following code:

   /***********************************************************************************************
   Takes the RSS news feeds found at $url and prints them as HTML code.
   Each news item is rendered in a <div class="rss"> block in the order: date + title + description.
   ***********************************************************************************************/
   function Render($url, $max_feeds = 1000)
   {   
      $doc = new DOMDocument();

      if(@$doc->load($url, LIBXML_NOCDATA|LIBXML_NOBLANKS))
      {
         $feed_count = 0;
         $items = $doc->getElementsByTagName("item");
         //echo $items->length; //DEBUG
         foreach($items as $item)
         {              
                if($feed_count >= $max_feeds) //stop once $max_feeds items have been printed
                   break;

                //Unfortunately the elements inside an <item> node are not always in the same order, therefore we have to call getElementsByTagName several times
                //WARNING: iconv is used instead of utf8_decode because the latter did not properly convert some characters, such as the apostrophe 0x19 in techsport.it feeds.
                $title = iconv('UTF-8', 'CP1252', $item->getElementsByTagName("title")->item(0)->firstChild->textContent); //can use "CP1252//TRANSLIT"
                $description = iconv('UTF-8', 'CP1252', $item->getElementsByTagName("description")->item(0)->firstChild->textContent); //can use "CP1252//TRANSLIT"
                $link = iconv('UTF-8', 'CP1252', $item->getElementsByTagName("link")->item(0)->firstChild->textContent); //can use "CP1252//TRANSLIT"

                //pubDate tag is not mandatory in RSS [RSS2 spec: http://cyber.law.harvard.edu/rss/rss.html]
                $pub_date = $item->getElementsByTagName("pubDate"); $date_html = "";
                //play with date here if you want

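                //Escape link and title before output; the description is left as-is because RSS descriptions commonly carry HTML markup
                echo "<div class='rss'>\n<p class='title'><a href='" . htmlspecialchars($link, ENT_QUOTES, 'cp1252') . "'>" . htmlspecialchars($title, ENT_QUOTES, 'cp1252') . "</a></p>\n<p class='description'>" . $description . "</p>\n</div>\n\n";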

                $feed_count++;
        }
      }
      else
         echo "<div class='rss'>Service not available.</div>";
   }


I have been using SimpleXML for a while.
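
For reference, reading a feed with SimpleXML might look roughly like this. It is a sketch that assumes an RSS 2.0 layout (rss/channel/item), and the function name rss_items is made up for the example.

```php
<?php
// Hypothetical helper: turn the <item> elements of an RSS 2.0 string into arrays.
function rss_items(string $xmldata, int $max_items = 10): array
{
    // Suppress libxml warnings for bad input; a failed parse returns false.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($xmldata);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($xml === false) {
        return [];
    }

    $items = [];
    // xpath() returns an array of matching elements (empty if there are none).
    foreach ($xml->xpath('/rss/channel/item') ?: [] as $item) {
        if (count($items) >= $max_items) {
            break;
        }
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
            // pubDate is optional in RSS 2.0, so fall back to null when it is missing
            'date'  => isset($item->pubDate) ? (string) $item->pubDate : null,
        ];
    }
    return $items;
}
```

Casting each child to string is what pulls out the text; without the cast you get SimpleXMLElement objects back.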
