I would like to take data from the Facebook Graph API and analyze it to estimate roughly how close one person is to another. I am using the Pylons framework with SQLAlchemy (currently backed by a SQLite database) to store information from the Graph API so that I can make it available to my other applications via a RESTful web service. What would be the best approach to analyzing the data?
For example, should I create objects analogous to the nodes and edges in the Graph API (users, posts, statuses, etc.), analyze them, and then store only the results of that analysis in the database, perhaps the UIDs of each node and its connections to other nodes? Or should I store even less, and only keep a database of the users and their close friends? Or should I go step by step, store each of the objects in the database via the ORM, and run the analysis against the database once it has been filled?
What sorts of concerns go into designing a database in situations like this? How should objects relate/map to the model? Where in the whole process of grabbing and storing data should the analysis take place?
I'd store as much as possible: dump everything you can, and try to preserve the relationships between nodes so you can traverse and analyze them later. That lets you re-run the analysis on the same data set as often as you like and try different approaches without going back to the API. If you want to use SQLAlchemy, you could use a simple self-referential relationship: http://www.sqlalchemy.org/docs/05/mappers.html#adjacency-list-relationships. That way you can maintain the connections between objects and traverse them easily.
You should also think about using MongoDB. It's pretty nice for this sort of thing, since you can pretty much just dump the JSON responses you get from Facebook straight into it. It also has a great Python client. Here are the MongoDB docs on storing a tree in MongoDB: http://www.mongodb.org/display/DOCS/Trees+in+MongoDB. A couple of the approaches described there make sense here.
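To make the "store everything, keep the relationships, analyze later" idea concrete, here is a rough sketch. It uses the standard-library `sqlite3` module rather than SQLAlchemy or MongoDB, purely to stay dependency-free. The table layout (`node`, `edge`), the helper names, the sample payloads, and the closeness measure (a plain count of stored edges between two users) are all assumptions made for illustration, not anything defined by the Graph API.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) node and edge tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS node (
            uid  TEXT PRIMARY KEY,
            kind TEXT NOT NULL,   -- e.g. 'user', 'post', 'status'
            raw  TEXT NOT NULL    -- the untouched JSON payload
        );
        CREATE TABLE IF NOT EXISTS edge (
            src  TEXT NOT NULL REFERENCES node(uid),
            dst  TEXT NOT NULL REFERENCES node(uid),
            kind TEXT NOT NULL    -- e.g. 'friend', 'comment'
        );
    """)
    return conn


def save_node(conn, kind, payload):
    """Store the full JSON payload so it can be re-analyzed later."""
    conn.execute(
        "INSERT OR REPLACE INTO node (uid, kind, raw) VALUES (?, ?, ?)",
        (payload["id"], kind, json.dumps(payload)),
    )


def save_edge(conn, src, dst, kind):
    """Record one connection between two stored nodes."""
    conn.execute(
        "INSERT INTO edge (src, dst, kind) VALUES (?, ?, ?)",
        (src, dst, kind),
    )


def closeness(conn, a, b):
    """Assumed metric: number of stored edges between a and b, either direction."""
    row = conn.execute(
        "SELECT COUNT(*) FROM edge "
        "WHERE (src = ? AND dst = ?) OR (src = ? AND dst = ?)",
        (a, b, b, a),
    ).fetchone()
    return row[0]


# Made-up sample data standing in for Graph API responses.
conn = open_store()
for user in (
    {"id": "1", "name": "Alice"},
    {"id": "2", "name": "Bob"},
    {"id": "3", "name": "Carol"},
):
    save_node(conn, "user", user)

save_edge(conn, "1", "2", "friend")
save_edge(conn, "2", "1", "comment")
save_edge(conn, "1", "3", "friend")

scores = {other: closeness(conn, "1", other) for other in ("2", "3")}
```

Because the raw payloads and the edges stay in the database, a different metric (weighting comments more heavily than friendships, say) can be tried later just by writing another query, without fetching anything again.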