
XML encoding error while writing to a file

Developer https://www.devze.com 2023-04-08 22:19 Source: Internet

I think I am following the right approach but I am still getting an encoding error:

from xml.dom.minidom import Document
import codecs

doc = Document()
wml = doc.createElement("wml")
doc.appendChild(wml)

property = doc.createElement("property")
wml.appendChild(property)

descriptionNode = doc.createElement("description")
property.appendChild(descriptionNode)
descriptionText = doc.createTextNode(description.decode('ISO-8859-1'))
descriptionNode.appendChild(descriptionText)

file = codecs.open('contentFinal.xml', 'w', encoding='ISO-8859-1')
file.write(doc.toprettyxml())
file.close()

The description node contains some characters in ISO-8859-1 encoding, which is the encoding the site itself specifies in its meta tag. But when doc.toprettyxml() is called to write to the file, I get the following error:

Traceback (most recent call last):
File "main.py", line 467, in <module>
    file.write(doc.toprettyxml())
File "C:\Python27\lib\xml\dom\minidom.py", line 60, in toprettyxml
    return writer.getvalue()
File "C:\Python27\lib\StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 10: ordinal not in range(128)

Why am I getting this error when I am decoding and encoding with the same standard?
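
The traceback ends in ''.join(self.buflist). In Python 2, joining a list that mixes unicode objects with byte strings makes the interpreter decode the byte strings implicitly with the ascii codec, so presumably some value added to the document was still an undecoded byte string. A minimal sketch of that failing step, written as an explicit decode so it behaves the same on Python 2 and 3 (the variable names are hypothetical, and 0xe1 is simply the byte named in the traceback):

```python
# Hypothetical reproduction of the failing step; runs on Python 2 and 3.
raw = b'\xe1'  # the byte from the traceback; it is "á" in ISO-8859-1

try:
    # Python 2 does this implicitly when a byte string meets a unicode string.
    raw.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding with the actual source encoding succeeds.
text = raw.decode('ISO-8859-1')
```

If this is the cause, every byte string placed in the document needs the same .decode('ISO-8859-1') treatment that description already gets.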

Edited

I have the following declaration in my script file:

#!/usr/bin/python
# -*- coding: utf-8 -*-

Maybe this is conflicting?


OK, I have found a solution. Whenever the data is in another language, you just need to declare the proper encoding in the XML header. You do not need to pass the encoding to toprettyxml(), as in file.write(doc.toprettyxml(encoding='ISO-8859-1')), nor when opening the file for writing, as in file = codecs.open('contentFinal.xml', 'w', encoding='ISO-8859-1'). Below is the technique I used. It may not be a professional method, but it works for me.

file = codecs.open('abc.xml', 'w')
xm = doc.toprettyxml()
xm = xm.replace('<?xml version="1.0" ?>', '<?xml version="1.0" encoding="ISO-8859-1"?>')
file.write(xm)
file.close()

Maybe there is a way to set the default encoding in the header, but I could not find it. The method above produces no errors in the browser, and all the data displays correctly.
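
For what it is worth, minidom can write the encoding into the XML declaration itself: toprettyxml() accepts an encoding argument and then returns encoded bytes rather than text. A minimal sketch, assuming every text node is already a decoded (unicode) string; the sample text and the output file name are placeholders, not taken from the original script:

```python
# -*- coding: utf-8 -*-
import io
from xml.dom.minidom import Document

doc = Document()
wml = doc.createElement("wml")
doc.appendChild(wml)

description_node = doc.createElement("description")
wml.appendChild(description_node)
# Placeholder text; the original script uses description.decode('ISO-8859-1') here.
description_node.appendChild(doc.createTextNode(u"caf\xe9 \xe1"))

# With encoding=..., minidom puts the encoding in the XML declaration
# and returns bytes already encoded as ISO-8859-1.
xml_bytes = doc.toprettyxml(encoding="ISO-8859-1")

# Write the bytes unchanged; binary mode avoids a second, implicit encoding step.
with io.open("contentFinal_example.xml", "wb") as out:
    out.write(xml_bytes)
```

Because the result is already encoded, the file is opened in binary mode instead of through codecs.open with an encoding, which would try to encode the data a second time.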
