I have created a GAE app that parses RSS feeds using cElementTree. Testing on my local installation of GAE works fine, but when I uploaded the app and tested it, I got a SyntaxError.
The error is:
Traceback (most recent call last):
  File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 509, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/palmfeedparser/1-6.339910418736930444/pipes.py", line 285, in post
    tree = ET.parse(urlopen(URL))
  File "<string>", line 45, in parse
  File "<string>", line 32, in parse
SyntaxError: no element found: line 14039, column 45
I did what Mr. Alex Martelli suggested, and it printed out the following on my local machine:
[
' <ac:tag><![CDATA[Mobilit\xc3\xa4t]]></ac:tag>\n',
' </ac:tags>\n',
' <ac:images>\n',
' <ac:image ac:number="1">\n',
' <ac:asset_url ac:type="app">http://cdn.downloads.example.com/public/1198/de/images/1/A/01.png</ac:asset_url>\n'
]
I uploaded the app and it printed out:
[
' <ac:tag><![CDATA[Mobilit\xc3\xa4t]]></ac:tag>\n',
' </ac:tags>\n',
' <ac:images>\n',
' <ac:image ac:number="1">\n',
' <ac:asset_url ac:type="app">http://cdn.downloads.example.com/public/1198/de/images/1/A/01.png</ac:asset_url>\n'
]
These lines correspond to the following lines in the RSS feed I am reading:
<ac:tags>
<ac:tag><![CDATA[Mobilität]]></ac:tag>
</ac:tags>
<ac:images>
<ac:image ac:number="1">
<ac:asset_url ac:type="app">http://cdn.downloads.example.com/public/1198/de/images/1/A/01.png</ac:asset_url>
I notice that there is a newline before the closing ac:tags. Line 14039 corresponds to this newline.
Update:
I use urllib.urlopen to access the URL of the feed. I displayed the contents it fetches both locally and on GAE proper. Locally, no content is truncated. After uploading the app, however, the feed, which has 15289 lines, is truncated to 14185 lines.
What method can I use to fetch this huge feed? Would urlfetch work?
Thanks in advance for your help!
A_iyer
You may have run into one of the mysterious limits placed on GAE.
On GAE, urlopen has been overridden by Google to use its urlfetch method, so there shouldn't be any difference between the two (though it might be worth trying urlfetch directly; there are a lot of hidden things in GAE).
Newline characters shouldn't affect cElementTree.
Are there any other messages coming through in your App Engine logs (relating to the urlopen request)?
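To confirm that truncation, rather than the newline, is what breaks the parse, it may help to read the whole response into a string first and inspect it before handing it to the parser. The following is only a minimal sketch: diagnose_feed is a hypothetical helper name, it uses the standard xml.etree.ElementTree module rather than cElementTree, and the "except ... as" syntax assumes Python 2.6 or later. It relies on the parser's error being a SyntaxError (or a subclass of it), as in the traceback above.

```python
import xml.etree.ElementTree as ET


def diagnose_feed(data):
    """Try to parse raw feed bytes; on failure report size and last lines.

    Hypothetical helper for illustration only. `data` is the full body
    already read from the response, e.g. urlopen(URL).read().
    """
    lines = data.splitlines()
    try:
        root = ET.fromstring(data)
    except SyntaxError as err:  # the parser's error is (a subclass of) SyntaxError
        return {
            "ok": False,
            "error": str(err),
            "bytes": len(data),
            "lines": len(lines),
            "tail": lines[-3:],  # last lines actually received
        }
    return {"ok": True, "root": root.tag, "bytes": len(data), "lines": len(lines)}


if __name__ == "__main__":
    complete = b"<rss><channel><item>a</item></channel></rss>"
    print(diagnose_feed(complete))
    print(diagnose_feed(complete[:20]))  # simulate a truncated download
```

Logging the "bytes" and "lines" values both locally and on GAE should show whether the deployed app receives a shorter body, and "tail" shows where the content stops.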