Convert large XML file to CSV in PHP

Developer https://www.devze.com 2022-12-18 10:42 Source: Internet
I have a 50 MB XML file. I want to convert it to a CSV file, but most methods I have found exhaust the server memory. Is there a good way to do this using a streaming method such as XMLReader?


You'd want to use XMLReader to parse the XML. It is a streaming (pull) parser, i.e. it doesn't load everything into memory, but rather reads node by node as it advances through the input file.
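A minimal sketch of the XMLReader approach, assuming the flat <domains><domain><name>...</name></domain></domains> layout mentioned elsewhere in this thread. The function name xmlToCsv() and the $recordTag parameter are made up for illustration; only one record is expanded into memory at a time.

```php
<?php
// Sketch only: xmlToCsv() and $recordTag are hypothetical names.
// Assumes every record is an element named $recordTag whose children are
// simple text fields, e.g. <domain><name>myname.com</name></domain>.
function xmlToCsv(string $xmlPath, string $csvPath, string $recordTag = 'domain'): int
{
    $reader = new XMLReader();
    if (!$reader->open($xmlPath)) {
        throw new RuntimeException("Cannot open $xmlPath");
    }
    $out = fopen($csvPath, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot write $csvPath");
    }

    $rows = 0;
    // read() moves to the next node in the stream; nothing is kept behind it.
    while ($reader->read()) {
        if ($reader->nodeType !== XMLReader::ELEMENT || $reader->name !== $recordTag) {
            continue;
        }
        // Expand just this one record so its fields are easy to access.
        $record = new SimpleXMLElement($reader->readOuterXml());
        $row = [];
        foreach ($record->children() as $child) {
            $row[] = (string) $child;
        }
        fputcsv($out, $row, ',', '"', '\\');
        $rows++;
    }

    $reader->close();
    fclose($out);
    return $rows; // number of CSV rows written
}
```

Calling xmlToCsv('/path/to/input.xml', '/path/to/output.csv') (placeholder paths) would write one CSV row per <domain> element. Memory use depends on the size of the largest single record, not on the size of the file.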


The SAX-style, expat-based parser is the most space-efficient option:

http://php.net/xml_parse

It executes your $start_element_handler and $end_element_handler callbacks whenever an element tag is opened or closed, rather than keeping the entire document in memory.
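As a rough illustration of the callback style, here is a sketch that feeds the file to the parser in 8 KB chunks and writes a CSV row each time a record element closes. It assumes flat records such as <domain><name>...</name></domain>; saxToCsv(), $recordTag and the chunk size are illustrative choices, not part of the original answer.

```php
<?php
// Sketch only: saxToCsv() and $recordTag are hypothetical names.
// Assumes each record's children are simple, non-nested text fields.
function saxToCsv(string $xmlPath, string $csvPath, string $recordTag = 'domain'): int
{
    $in = fopen($xmlPath, 'r');
    $out = fopen($csvPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Cannot open input or output file');
    }

    $row = null;   // fields of the record being read, null outside a record
    $field = null; // name of the field element currently open, if any
    $rows = 0;

    $parser = xml_parser_create();
    // Keep tag names as written (the default folds them to upper case).
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start handler: a record opens, or a field inside a record opens.
        function ($p, string $name, array $attrs) use (&$row, &$field, $recordTag) {
            if ($name === $recordTag) {
                $row = [];
            } elseif ($row !== null) {
                $field = $name;
                $row[$field] = '';
            }
        },
        // End handler: when a record closes, write it out and forget it.
        function ($p, string $name) use (&$row, &$field, &$rows, $out, $recordTag) {
            if ($name === $recordTag && $row !== null) {
                fputcsv($out, $row, ',', '"', '\\');
                $rows++;
                $row = null;
            }
            $field = null;
        }
    );
    // Text may arrive in several pieces, so append rather than assign.
    xml_set_character_data_handler(
        $parser,
        function ($p, string $data) use (&$row, &$field) {
            if ($row !== null && $field !== null) {
                $row[$field] .= $data;
            }
        }
    );

    while (!feof($in)) {
        $chunk = fread($in, 8192);
        if (!xml_parse($parser, $chunk, feof($in))) {
            throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
        }
    }

    unset($parser);
    fclose($in);
    fclose($out);
    return $rows; // number of CSV rows written
}
```

Only the current chunk and the current record are held in memory, which is what makes this style suitable for large inputs.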

Still, 50 MB is not a lot; maybe your provider can raise the memory limit:

php_value memory_limit 100M

in .htaccess or httpd.conf (this directive requires mod_php), or set memory_limit in php.ini.


I wrote a converter for this some time ago. Feel free to give it a shot.

https://web.archive.org/web/20120423125804/http://sites.google.com/site/soichih/q-a/xml-to-csv-converter


Late to the party...

For an XML structure like <domains><domain><name>myname.com</name></domain></domains>:

// Note: SimpleXML loads the whole document into memory,
// so this only works while the file fits within memory_limit.
$url = "http://mysite.com/my.xml";
$xml = simplexml_load_file($url);
if ($xml === false) {
    exit('Could not load ' . $url);
}

$csv = 'my.csv';
$path = '/var/www/html/';
$fullpath = $path . $csv;

// <domains> is the root element, so $xml already points at it.
$fp = fopen($fullpath, 'w');
foreach ($xml->domain as $domain) {
    // One CSV row per <domain>, one column per child element.
    fputcsv($fp, get_object_vars($domain), ',', '"', '\\');
}
fclose($fp);

// Send the generated file to the browser as a download.
header('Content-Description: File Transfer');
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="' . basename($csv) . '"');
header('Content-Transfer-Encoding: binary');
header('Expires: Mon, 26 Jul 1997 05:00:00 GMT');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Length: ' . filesize($fullpath));
readfile($fullpath);
exit;


Have you tried increasing the memory limit? ini_set('memory_limit', '256M');

(That's a very bad solution, by the way.)


I don't know much about the PHP API, but it seems this class can help you: XML Parser

Basically, you're looking for an event-based parser, like good old SAX. This type of parser fires an event (calls a handler) for each element it encounters instead of building a tree. It is memory-efficient because it never needs to load your entire document into memory.


If the XML file is simple enough that you can avoid a full-fledged XML parser, PHP could read it line by line and export each record as it goes, which would avoid holding the whole file in memory at once. What's the XML structure?
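A sketch of that idea, under the strong assumption that every record sits on a single line (for example <domain><name>myname.com</name></domain>) and that fields contain no nested tags, CDATA, or attributes. linesToCsv() is a hypothetical name, and a regular expression is not a real XML parser, so this only suits machine-generated files with a predictable layout.

```php
<?php
// Sketch only: linesToCsv() is a hypothetical name.
// Assumes one complete record per line and plain-text leaf elements.
function linesToCsv(string $xmlPath, string $csvPath): int
{
    $in = fopen($xmlPath, 'r');
    $out = fopen($csvPath, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Cannot open input or output file');
    }

    $rows = 0;
    // fgets() holds only one line in memory at a time.
    while (($line = fgets($in)) !== false) {
        // Match leaf elements such as <name>myname.com</name>.
        if (!preg_match_all('~<([\w.-]+)>([^<]*)</\1>~', $line, $m)) {
            continue; // wrapper tags, XML declaration, blank lines
        }
        // Turn entities like &amp; back into plain characters.
        $row = array_map(
            fn (string $v): string => html_entity_decode($v, ENT_QUOTES | ENT_XML1, 'UTF-8'),
            $m[2]
        );
        fputcsv($out, $row, ',', '"', '\\');
        $rows++;
    }

    fclose($in);
    fclose($out);
    return $rows; // number of CSV rows written
}
```

If records span several lines, this shortcut no longer applies and a streaming parser is the safer choice.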
