Python and XML Processing

Developer https://www.devze.com 2023-03-27 07:18 Source: Internet

Related topics: python xml

I have used urllib to get the following data:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<videos xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:www="http://www.www.com">
  <video type="cl">
    <cd>
      <src lang="music">http://www.google.com/ </src>
    </cd>
  </video>
</videos>

I want to extract http://www.google.com/ from it. Here is my code:

import xml.etree.ElementTree as etree
data='<?xml version="1.0" encoding="UTF-8" standalone="yes"?><videos xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:www="http://www.www.com"><video type="cl"><cd><src lang="music">http://www.google.com/ </src></cd></video></videos>'
tree = etree.fromstring(data)
geturl=tree.findtext('/video/cd/src').strip()
print geturl

I get this error:

AttributeError: 'NoneType' object has no attribute 'strip'

Evidently findtext found nothing and returned None. I also tried findtext('src'), but that doesn't work either.

What's wrong?

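Remove the leading forward slash from the path, so that it reads video/cd/src. fromstring returns the root <videos> element, and findtext resolves the path relative to that element. This is also why findtext('src') finds nothing: a bare tag name matches only direct children, and src sits two levels below <videos>. The corrected code: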

import xml.etree.ElementTree as etree
data='''<?xml version="1.0" encoding="UTF-8" standalone="yes"?><videos xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:www="http://www.www.com"><video type="cl"><cd><src lang="music">http://www.google.com/ </src></cd></video></videos>'''
tree = etree.fromstring(data)
geturl=tree.findtext('video/cd/src').strip()
print geturl

yields

http://www.google.com/
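
For readers on Python 3, the same lookup can be sketched as below. This is a minimal illustration rather than the original poster's code: the helper name first_src_url and the constant DATA are made up for the example, and passing default="" to findtext is one possible way to avoid calling strip() on None when a path matches nothing.

```python
import xml.etree.ElementTree as ET

# Same document as in the question (hypothetical constant name).
DATA = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<videos xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'xmlns:www="http://www.www.com">'
    '<video type="cl"><cd>'
    '<src lang="music">http://www.google.com/ </src>'
    '</cd></video></videos>'
)


def first_src_url(xml_text, path="video/cd/src"):
    """Return the stripped text of the first match for path, or None."""
    root = ET.fromstring(xml_text)  # root is the <videos> element
    # default="" means a failed lookup gives "" instead of None,
    # so .strip() is always safe to call.
    text = root.findtext(path, default="")
    return text.strip() or None


print(first_src_url(DATA))            # path relative to <videos>
print(first_src_url(DATA, ".//src"))  # first <src> at any depth
print(first_src_url(DATA, "src"))     # direct children only: prints None
```

The ".//src" form is handy when the exact nesting is not known in advance, while the explicit video/cd/src path is stricter about where the element may appear.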