In many REST-based API responses there is a parameter called nextURL, which we can use to query for the next page of results. It is usually on the root element (or maybe the next one).
In general, how do you guys read this? If you use a standard XML parser, it reads and loads the entire XML, and only then do you get to read nextURL via getElementsByTagName. Is there a better workaround? Reading the entire XML is of course a waste of time and memory.
Edit: An example XML would be something like
<result publisher="xyz" nextURL="http://actualurl?since_date=<newdate>">
<element>adfsaf</element>
..
</result>
I need to capture the new since_date without reading the entire XML.
Python: You could use the ElementTree iterparse method ... provided the data you want is in an attribute, which will have been parsed by the time that you get the start event. If it's in the text or tail of the element, you will have to wait until the end event. It would be a good idea if you edited your question to show what your XML looks like, and explain "or maybe in the next one" with an example.
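A minimal sketch of the iterparse approach, assuming nextURL is an attribute on the root element as in the question's example. The helper name read_next_url and the sample date are made up for illustration, and the sample uses a concrete date because a literal < is not allowed inside an attribute value.

```python
import io
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, parse_qs


def read_next_url(stream, attr="nextURL"):
    """Return the given attribute of the root element, or None.

    `stream` is any binary file-like object (a file, an HTTP response, ...).
    Only "start" events are requested, and attributes are already parsed
    when a start event fires, so we can return on the very first event.
    iterparse reads the stream in chunks, so the rest is never parsed.
    """
    for _event, elem in ET.iterparse(stream, events=("start",)):
        return elem.attrib.get(attr)
    return None


# Hypothetical response body modelled on the example in the question.
sample = (
    b'<result publisher="xyz" '
    b'nextURL="http://actualurl?since_date=2011-01-01">'
    b"<element>adfsaf</element>"
    b"</result>"
)

next_url = read_next_url(io.BytesIO(sample))
# Pull since_date out of the query string of the captured URL.
since_date = parse_qs(urlsplit(next_url).query)["since_date"][0]
```

If the value lived in element text instead, you would request "end" events and stop at the first end event for that element, as noted above.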
The term "standard XML parser" covers a lot of territory, so much so that I don't think you can generalize about their behavior. For instance, a standard DOM parser is tree-based and will read the entire XML into memory. A SAX parser does not build a tree; it streams events to your callbacks as it reads. A StAX parser (I think) goes one step further and advances only when the app asks it to. It sounds like one of the latter, a SAX or StAX parser, is what you need.
Edit: Please be sure to read KitsuneYMG's comment below on the difference between SAX and StAX behaviors.
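For comparison with the SAX route, here is a rough sketch using Python's xml.sax module. A SAX parser pushes events at the handler, so one common way to stop early is to raise an exception from the callback and catch it outside. The names StopParsing, RootAttrHandler and read_next_url_sax are hypothetical, and the sample document is made up.

```python
import io
import xml.sax


class StopParsing(Exception):
    """Raised from the handler to abort parsing; carries the found value."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


class RootAttrHandler(xml.sax.ContentHandler):
    """Looks at the first start tag only, then aborts the parse."""

    def __init__(self, attr):
        super().__init__()
        self.attr = attr

    def startElement(self, name, attrs):
        # The first startElement call is the root element.
        raise StopParsing(attrs.get(self.attr))


def read_next_url_sax(stream, attr="nextURL"):
    parser = xml.sax.make_parser()
    parser.setContentHandler(RootAttrHandler(attr))
    try:
        parser.parse(stream)
    except StopParsing as stop:
        return stop.value
    return None  # document had no elements at all


sample = (
    b'<result publisher="xyz" '
    b'nextURL="http://actualurl?since_date=2011-01-01">'
    b"<element>adfsaf</element>"
    b"</result>"
)
sax_url = read_next_url_sax(io.BytesIO(sample))
```

With a pull-style parser the exception trick is unnecessary, because the application simply stops asking for the next event.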