For my own personal purposes, I have a list of about 300 authors (full names) of various books. I want to partition this list into "fiction authors" and "non-fiction authors". If an author writes both, the majority gets the vote.
I looked at the Amazon Product Search API: I can search by author (in Python), but there is no way to find the book category (fiction vs. the rest):
>>> node = api.item_search('Books', Author='Richard Dawkins')
>>> for book in node.Items.Item:
...     print(book.ItemAttributes.Title)
What are my options? I prefer to do this in Python.
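The majority rule from the question can be sketched independently of where the category data comes from. This is only a minimal illustration: `classify_author` and `is_fiction` are hypothetical helper names, the input is assumed to be one subject label per book, and ties are arbitrarily resolved as non-fiction.

```python
from collections import Counter


def is_fiction(subject):
    """Treat labels such as 'Fiction' or 'Juvenile Fiction' as fiction.

    Labels such as 'Nonfiction' or 'Non-fiction' are excluded. This matching
    rule is an assumption, not something any particular API guarantees.
    """
    label = subject.strip().lower()
    squashed = label.replace("-", "").replace(" ", "")
    return "fiction" in label and "nonfiction" not in squashed


def classify_author(subjects):
    """Return 'fiction' or 'non-fiction' by majority vote over an author's books.

    `subjects` holds one subject label per book. Returns None when there is
    no data; a tie counts as non-fiction (an arbitrary choice).
    """
    if not subjects:
        return None
    votes = Counter("fiction" if is_fiction(s) else "non-fiction" for s in subjects)
    return "fiction" if votes["fiction"] > votes["non-fiction"] else "non-fiction"


if __name__ == "__main__":
    print(classify_author(["Fiction", "Fiction", "Science"]))
```

The open part of the problem is still how to obtain a subject label for each book.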
Well, you can try another service: the Google Book Search API. To use it from Python, have a look at gdata-python-api. In its protocol, the result feed contains a <dc:subject> node, which is probably what you need:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/"
xmlns:gbs="http://schemas.google.com/books/2008"
xmlns:dc="http://purl.org/dc/terms"
xmlns:gd="http://schemas.google.com/g/2005">
<id>http://www.google.com/books/feeds/volumes</id>
<updated>2008-08-12T23:25:35.000</updated>
<!-- a lot of information here, just removed those nodes to save space.. -->
<entry>
<dc:creator>Jane Austen</dc:creator>
<dc:creator>James Kinsley</dc:creator>
<dc:creator>Fiona Stafford</dc:creator>
<dc:date>2004</dc:date>
<dc:description>
If a truth universally acknowledged can shrink quite so rapidly into
the opinion of a somewhat obsessive comic character, the reader may reasonably feel ...
</dc:description>
<dc:format>382</dc:format>
<dc:identifier>8cp-Z_G42g4C</dc:identifier>
<dc:identifier>ISBN:0192802380</dc:identifier>
<dc:publisher>Oxford University Press, USA</dc:publisher>
<dc:subject>Fiction</dc:subject>
<dc:title>Pride and Prejudice</dc:title>
<dc:title>A Novel</dc:title>
</entry>
</feed>
Of course, this protocol also returns some extra information about each book that you may not need (such as whether or not it is visible on Google Books).
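As a rough sketch of how the <dc:subject> values could be pulled out of such a feed, the following uses only the standard library's ElementTree. It assumes the feed looks like the sample above, including the `http://purl.org/dc/terms` namespace URI; `subjects_from_feed` and `SAMPLE_FEED` are hypothetical names, and fetching the feed itself is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample feed; a real feed may differ.
ATOM_NS = "http://www.w3.org/2005/Atom"
DC_NS = "http://purl.org/dc/terms"


def subjects_from_feed(feed_xml):
    """Collect the text of every <dc:subject> inside each <entry> of the feed."""
    root = ET.fromstring(feed_xml)
    subjects = []
    for entry in root.iter("{%s}entry" % ATOM_NS):
        for node in entry.findall("{%s}subject" % DC_NS):
            if node.text:
                subjects.append(node.text.strip())
    return subjects


# Trimmed-down stand-in for the sample response above.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dc="http://purl.org/dc/terms">
  <entry>
    <dc:creator>Jane Austen</dc:creator>
    <dc:subject>Fiction</dc:subject>
    <dc:title>Pride and Prejudice</dc:title>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(subjects_from_feed(SAMPLE_FEED))
```

Running this for each book by an author gives a list of subject labels that can then be tallied to decide whether the author is mostly fiction or non-fiction.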
Did you look at BrowseNodes? To me (I have not used this API before) it seems that BrowseNodes correspond to Amazon's product categories. Maybe you will find more information there.
After spending some time messing with the Amazon API, it looks like they don't provide the kind of information you want.
They don't mention categories of that type in their documentation, and if you serialise what the API sends you, there is not a single mention of fiction or non-fiction categories.
You can use this to print out a nicely formatted XML string (you might want to redirect it to a file for easier reading) with everything the API sends.
from lxml import etree
node = api.item_search('Books', Author='Richard Dawkins')
print(etree.tostring(node, pretty_print=True, encoding='unicode'))