PHP simpleXML trying to process fairly complex file

Developer https://www.devze.com 2023-02-24 22:08 Source: the Internet

The file I have to work with has the following structure:

<?xml version="1.0" encoding="UTF-8" ?>
<FormattedReport xmlns = 'urn:crystal-reports:schemas' xmlns:xsi = 'http://www.w3.org/2000/10/XMLSchema-instance'>
    <FormattedAreaPair Level="0" Type="Report">
    <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
    <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
 <FormattedAreaPair Level="1" Type="Details">
    <FormattedArea Type="Details">
        <FormattedSections>
        <FormattedSection SectionNumber="0">
        <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
    <FormattedReportObjects>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName>
        <FormattedValue>1,907</FormattedValue>
        <Value>1907.00</Value>
        </FormattedReportObject>
        <FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName>
        <FormattedValue>14/04/2011</FormattedValue>
        <Value>2011-04-14T00:00:00</Value>
        </FormattedReportObject>
        ... so on and so forth ...
        </FormattedReportObjects>
        </FormattedSection>
        </FormattedSections>
        </FormattedArea>
        </FormattedAreaPair>
        </FormattedAreaPair>
        </FormattedReport>

What I'm trying to do is call a PHP function that parses the XML so the data can eventually be stored in an SQL database.

For example:

ManifestNR: 1903 ShippingDate: 12/04/2011 CarrierID: TNT03 TrackingRef: 234234232 ... etc. for each record ...

I set about doing this with DOM and then stumbled across SimpleXML. I've read several tutorials and searched the implementations posted here, but I just can't seem to access the data in the final nodes (or any other data, to be honest). Is SimpleXML a no-no with this kind of structure?

The latest PHP I'm using is:

<?php

if (file_exists('tracking.xml')) {
    $xml = simplexml_load_file('tracking.xml');

  //  print_r($xml);

   foreach( $xml as $FormattedReport->FormattedAreaPair->FormattedAreaPair ) 
        {
        foreach($FormattedReport as $node->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects)
        echo $node->FormattedReportObject->Value;
        }

} else {
    exit('Failed to open xml');
}
?>

I've tried to strip it right back to basics, but still no luck. It doesn't echo a result.

Thanks for your time guys!

SOLVED

For anyone in similar circumstances, here's a bit of direction.

  1. Ignore the root node; it is already your default $variable when you import the XML string/file.
  2. If you have nested groups, build the path to the parent first, like so: $xml->FormattedAreaPair->FormattedAreaPair as $parentnode
  3. Using your parent node, loop through all the children.
  4. If you have an attribute field, access it as follows: (string) $node['FieldName']
  5. Compare the retrieved attribute with a string and then handle the result.
  6. Stop pulling your hair out.

    <?php

    if (file_exists('tracking.xml')) {
        $xml = simplexml_load_file('tracking.xml');

        //print_r($xml);
        // Outer loop: one iteration per Level="1" Details block (one record).
        foreach ($xml->FormattedAreaPair->FormattedAreaPair as $parentnode) {
            // Inner loop: one iteration per field of the current record.
            foreach ($parentnode->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects->FormattedReportObject as $node) {
                //echo "FormattedValue: ".$node->FormattedValue."<br />";
                switch ((string) $node['FieldName']) {
                    case '{tblCon.ManifestNR}':
                        echo 'Manifest: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ShippingDate}':
                        echo 'Shipping Date: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.CarrierID}':
                        echo 'Carrier ID: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.CustConRefTX}':
                        echo 'Customer Reference: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ServiceCodeTX}':
                        echo 'Service Code: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.TotalWeightNR}':
                        echo 'Total Weight: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.ValueNR}':
                        echo 'Value: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.TotalVolumeNR}':
                        echo 'Total Volume: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblCon.GoodsDesc}':
                        echo 'Goods Description: '.$node->FormattedValue."<br />";
                        break;
                    case '{tblConAddr.ReceiverNameTX}':
                        echo 'Receiver Name: '.$node->FormattedValue."<br />";
                        break;
                    case '{@SalesOrder}':
                        echo 'Sales Order: '.$node->FormattedValue."<br />";
                        break;
                    case '{@TrackingReference}':
                        echo 'Tracking Reference: '.$node->FormattedValue."<br />";
                        break;
                }
            }
            echo "---------------------------- <br />";
        }
    } else {
        exit('Failed to open xml');
    }
    ?>
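
Since the end goal in the question was an SQL table rather than echoed HTML, the same two loops can also collect each record into an array instead of printing it. This is only a rough sketch: the inline XML is a cut-down stand-in for tracking.xml, $records is a name made up here, and it assumes every field carries a raw Value element, as the two fields shown in the question do.

```php
<?php
// Cut-down stand-in for tracking.xml (two records, two fields each).
$xmlstr = <<<'XML'
<?xml version="1.0" encoding="UTF-8" ?>
<FormattedReport xmlns="urn:crystal-reports:schemas" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
<FormattedAreaPair Level="0" Type="Report">
<FormattedAreaPair Level="1" Type="Details">
<FormattedArea Type="Details"><FormattedSections><FormattedSection SectionNumber="0"><FormattedReportObjects>
<FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName><FormattedValue>1,907</FormattedValue><Value>1907.00</Value></FormattedReportObject>
<FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName><FormattedValue>14/04/2011</FormattedValue><Value>2011-04-14T00:00:00</Value></FormattedReportObject>
</FormattedReportObjects></FormattedSection></FormattedSections></FormattedArea>
</FormattedAreaPair>
<FormattedAreaPair Level="1" Type="Details">
<FormattedArea Type="Details"><FormattedSections><FormattedSection SectionNumber="0"><FormattedReportObjects>
<FormattedReportObject xsi:type="CTFormattedField" Type="xsd:long" FieldName="{tblCon.ManifestNR}"><ObjectName>ManifestNR1</ObjectName><FormattedValue>1,908</FormattedValue><Value>1908.00</Value></FormattedReportObject>
<FormattedReportObject xsi:type="CTFormattedField" Type="xsd:timeInstant" FieldName="{tblCon.ShippingDate}"><ObjectName>ShippingDate1</ObjectName><FormattedValue>15/04/2011</FormattedValue><Value>2011-04-15T00:00:00</Value></FormattedReportObject>
</FormattedReportObjects></FormattedSection></FormattedSections></FormattedArea>
</FormattedAreaPair>
</FormattedAreaPair>
</FormattedReport>
XML;

$xml = new SimpleXMLElement($xmlstr);

$records = [];
foreach ($xml->FormattedAreaPair->FormattedAreaPair as $parentnode) {
    $record = [];
    foreach ($parentnode->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects->FormattedReportObject as $node) {
        // Key by the FieldName attribute; keep the raw <Value> rather than
        // the display-formatted one (assumption: raw values suit a DB better).
        $record[(string) $node['FieldName']] = (string) $node->Value;
    }
    $records[] = $record;
}

print_r($records);
```

Each entry of $records is then one row's worth of data, which can be handed to whatever database layer you use.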


The examples in the PHP manual should suffice (Example #4 in particular). You seem like a sufficiently clever fellow; the problem is just that your foreach loops are written the wrong way round.

example.php

<?php
$xmlstr = <<<XML
<?xml version='1.0' standalone='yes'?>
<movies>
 <movie>
  <title>PHP: Behind the Parser</title>
  <characters>
   <character>
    <name>Ms. Coder</name>
    <actor>Onlivia Actora</actor>
   </character>
   <character>
    <name>Mr. Coder</name>
    <actor>El Act&#211;r</actor>
   </character>
  </characters>
  <plot>
   So, this language. It's like, a programming language. Or is it a
   scripting language? All is revealed in this thrilling horror spoof
   of a documentary.
  </plot>
  <great-lines>
   <line>PHP solves all my web problems</line>
  </great-lines>
  <rating type="thumbs">7</rating>
  <rating type="stars">5</rating>
 </movie>
</movies>
XML;
?>

Example #4

<?php
include 'example.php';

$xml = new SimpleXMLElement($xmlstr);

/* For each <character> node, we echo a separate <name>. */
foreach ($xml->movie->characters->character as $character) {
   echo $character->name, ' played by ', $character->actor, PHP_EOL;
}

?>

Notice that in the foreach construct, the path to the repeated nodes goes before "as". The item after "as" is just an (initially empty) variable that receives the current node on each iteration.


How to access namespaced attributes like i:nil (XMLSchema-instance) with SimpleXML:

XML:

<item i:nil="true"/>

PHP:

(bool) $item->attributes('i',true)->nil;
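
One thing to watch with the cast above: as far as I can tell, a (bool) cast on a SimpleXML attribute tells you whether the attribute exists, not what it contains, so i:nil="false" would still come out as true. Comparing the string value is more explicit. A small sketch follows; the items wrapper element and the $flags variable are made up for the example.

```php
<?php
// Made-up sample: one nil item, one explicitly non-nil item, one without the attribute.
$xml = new SimpleXMLElement(
    '<items xmlns:i="http://www.w3.org/2001/XMLSchema-instance">'
    . '<item i:nil="true"/><item i:nil="false"/><item/>'
    . '</items>'
);

$flags = [];
foreach ($xml->item as $item) {
    // attributes('i', true) selects attributes by namespace prefix.
    $nil = $item->attributes('i', true)->nil;
    // A missing attribute casts to an empty string, so it counts as not nil.
    $flags[] = ((string) $nil === 'true');
}

var_dump($flags);
```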


The file I was dealing with was ~1GB, so I couldn't load it into SimpleXML all at once. Here's the CI (CodeIgniter) controller I made to parse the Crystal Reports XML.

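<?php

class Parse_crystal_reports_xml extends CI_Controller {

    function index(){
        $base_path = "/path/to/xml/";
        $xml_file = "xml_file.xml";
        $file_header = '<?xml version="1.0" encoding="UTF-8" ?>';
        // Every record starts with this opening tag, so it is used to split the file into blocks.
        $separator = '<FormattedAreaPair Level="1" Type="Details">';
        $xml_data = explode($separator, str_replace($file_header, '', file_get_contents($base_path.$xml_file)));
        // The xsi namespace is declared on the root element, which the individual
        // blocks no longer include, so the prefixes are stripped before parsing.
        $bad_names = array('xsi:','xsd:');
        foreach($xml_data as $block_num => $block) : 
            // Block 0 is whatever precedes the first record (the report's opening tags); skip it.
            if(!$block_num) : continue; endif;
            // Put the header and the separator back so the block is a document of its own.
            $fields = new SimpleXMLElement(str_replace($bad_names, '', $file_header."\n".$separator.$block));
            $temp_array = array();
            foreach($fields->FormattedArea->FormattedSections->FormattedSection->FormattedReportObjects->FormattedReportObject as $field_num => $field) :
                // print_r($field);
                // Key: slug of the FieldName attribute; value: whitespace-normalised FormattedValue.
                $temp_array[$this->make_slug((string)$field['FieldName'])] = $this->clean_word((string)$field->FormattedValue);
            endforeach;
            // print_r($fields);
            print_r($temp_array);
            die; // debugging: stop after the first record
        endforeach;
    }

    // Lowercases the string and collapses runs of non-word characters into single underscores.
    function make_slug($string){
        return strtolower(trim(preg_replace('/\W+/', '_', $string), '_'));
    }

    // Collapses runs of whitespace into single spaces and trims the ends.
    function clean_word($string){
        return trim(preg_replace('/\s+/', ' ', $string));
    }
}
?>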
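
A side note on the ~1GB case: file_get_contents() still reads the whole file into memory before it gets split. A possible alternative, which is not what the controller above does, is to stream the file with XMLReader and hand one Details block at a time to SimpleXML. The sketch below assumes each record is a FormattedAreaPair with Level="1", as in the question; stream_crystal_records is a made-up function name, and I have not run this against a real Crystal Reports export.

```php
<?php
// Hypothetical helper: yields one record (FieldName => FormattedValue) at a time.
function stream_crystal_records(string $path): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Failed to open $path");
    }

    $doc = new DOMDocument();
    while ($reader->read()) {
        // Only react to the opening tag of a Level="1" record block.
        if ($reader->nodeType !== XMLReader::ELEMENT
            || $reader->localName !== 'FormattedAreaPair'
            || $reader->getAttribute('Level') !== '1') {
            continue;
        }

        // Expand just this subtree to DOM, then wrap it as SimpleXML.
        $pair = simplexml_import_dom($doc->importNode($reader->expand(), true));

        $record = [];
        foreach ($pair->FormattedArea->FormattedSections->FormattedSection
                      ->FormattedReportObjects->FormattedReportObject as $field) {
            $record[(string) $field['FieldName']] = trim((string) $field->FormattedValue);
        }
        yield $record;
    }
    $reader->close();
}
```

Usage would be something like foreach (stream_crystal_records('/path/to/xml/xml_file.xml') as $record) { ... }, so only one record's subtree is held as a DOM tree at any point.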