开发者

Editing the XML texts from a XML file using Python

开发者 https://www.devze.com 2022-12-14 04:00 出处:网络
I have an XML file which contains some data as given. <?xml version=\"1.0\" encoding=\"UTF-8\" ?>

I have an XML file which contains some data as given.

<?xml version="1.0" encoding="UTF-8" ?> 
- <ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" /> 
- <ParameterList count="85">
- <Parameter name="Spec 2 Included" type="boolean" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
- <Parameter name="Spec 2 Label" type="string" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
- <Parameter name="Spec 3 Included" type="boolean" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
- <Parameter name="Spec 3 Label" type="string" mode="both">
  <Value>n/a</Value> 
  <Result>n/a</Result> 
  </Parameter>
  </ParameterList>
  </ParameterData>

I have one text file with lines as

Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3   
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3

Now I want to edit XML texts; i,e. I want to replace the field (n/a) with the corresponding values from the text file. Like I want the file to looks like

<?xml version="1.0" encoding="UTF-8" ?> 
- <ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" /> 
- <ParameterList count="85">
- <Parameter name="Spec 2 Included" type="boolean" mode="both">
  <Value>TRUE</Value> 
  <Result>TRUE</Result> 开发者_Go百科
  </Parameter>
- <Parameter name="Spec 2 Label" type="string" mode="both">
  <Value>19-Flat2-HS3</Value> 
  <Result>19-Flat2-HS3</Result> 
  </Parameter>
- <Parameter name="Spec 3 Included" type="boolean" mode="both">
  <Value>FALSE</Value> 
  <Result>FALSE</Result> 
  </Parameter>
- <Parameter name="Spec 3 Label" type="string" mode="both">
  <Value>4-1-Bead1-HS3</Value> 
  <Result>4-1-Bead1-HS3</Result> 
  </Parameter>
  </ParameterList>
  </ParameterData>

I am new to this Python-XML coding. I dont have idea about how to edit the text fields in a XML file. I am trying to Use elementtree.ElementTree module. but to read the lines in XML file and extract the attributes I dont know which modules need to be imported.

Please help.

Thanks and Regards.


You can convert your data text into python dictionary by regular expression

data="""Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3"""

#data=open("data.txt").read()

import re

data=dict(re.findall('(Spec \d+ (?:Included|Label))\s*:\s*(\S+)',data))

data will be as follows

{'Spec 3 Included': 'FALSE', 'Spec 2 Included': 'TRUE', 'Spec 3 Label': '4-1-Bead1-HS3', 'Spec 2 Label': '19-Flat2-HS3'}

Then you can convert it by using any of your favoriate xml parser, I will use minidom here.

from xml.dom import minidom

dom = minidom.parseString(xml_text)
params=dom.getElementsByTagName("Parameter")
for param in params:
    name=param.getAttribute("name")
    if name in data:
        for item in param.getElementsByTagName("*"): # You may change to "Result" or "Value" only
            item.firstChild.replaceWholeText(data[name])

print dom.toxml()

#write to file
open("output.xml","wb").write(dom.toxml())

Results

<?xml version="1.0" ?><ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj"/>
  <ParameterList count="85">
    <Parameter mode="both" name="Spec 2 Included" type="boolean">
      <Value>TRUE</Value>
      <Result>TRUE</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 2 Label" type="string">
      <Value>19-Flat2-HS3</Value>
      <Result>19-Flat2-HS3</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 3 Included" type="boolean">
      <Value>FALSE</Value>
      <Result>FALSE</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 3 Label" type="string">
      <Value>4-1-Bead1-HS3</Value>
      <Result>4-1-Bead1-HS3</Result>
    </Parameter>
  </ParameterList>
</ParameterData>


Well, you could start with

import xml.etree.ElementTree as ET
tree = ET.parse("blah.xml")

Find the elements you want to modify.

To replace the contents of an element, just do

element.text = "TRUE"

The import statement above works in Python 2.5 or later. If you have an older version of Python you'll need to install ElementTree as an extension, and then the import statement is different: import elementtree.ElementTree as ET.


Unfortunately, the XPath supported by ElementTree isn't complete. Since Python 2.6 includes an older version, finding elements by attribute (as stated here) does not work. So Python's own documentation should be your first stop: xml.etree.ElementTree

import xml.etree.ElementTree as ET

original = ET.parse("original.xml")
parameters = original.findall(".//Parameter")
changes = {}

# read changes
with open("changes.txt", "rb") as in_file:
    for change in in_file:
        change = change.rstrip()                # remove line endings
        name, value = change.split(":")
        changes[name.strip()] = value.strip()   # remove whitespaces

# find paramter element and apply changes
for parameter in parameters:
    parameter_name = parameter.get("name")
    if changes.has_key(parameter_name):                
        value = parameter.find("./Value")
        value.text = changes[parameter_name]
        result = parameter.find("./Result")
        result.text = changes[parameter_name]

original.write("new.xml")


Here is how you could do it using Amara

from amara import bindery

doc = bindery.parse(XML)

def cleanup_for_dict(key, value):
    return key.strip(), value.strip()

params = dict(( cleanup_for_dict(*line.split(':', 1))
                for line in TEXT.splitlines()))

for param in doc.ParameterData.ParameterList.Parameter:
    if param.name in params:
        param.Value = params[param.name]
        param.Result = params[param.name]

doc.xml_write()
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号