I am trying to implement a transaction system for Cassandra with the help of ZooKeeper. Since I don't have much experience in database implementation, I would like to know whether my idea would work in principle, or whether it has any major flaw.
Here is a high-level description of the steps:
- Identify all the rows (keys) and columns to be edited. Let the keys be [K0..Kn].
- Acquire a write lock on every row involved (the locks are an in-memory ZooKeeper implementation).
- Copy the old values to separate locations in Cassandra, uniquely identified by the keys [K'0..K'n].
- Store [K'0..K'n] and their mapping to [K0..Kn] in ZooKeeper using persistent mode.
- Apply the update to the data.
- Delete the entries in ZooKeeper.
- Unlock the rows.
- Delete the [K'0..K'n] entries lazily on a maintenance thread (Cassandra deletion uses timestamps, so K'0..K'n can be reused by another transaction with a newer timestamp).
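To make the steps above concrete, here is a minimal sketch of the flow in Python. It does not talk to Cassandra or ZooKeeper: `InMemoryStore` and `InMemoryCoordinator` are hypothetical in-memory stand-ins invented for illustration, and `run_transaction` is an assumed name, not part of any library. Locking the keys in sorted order is an addition of mine to avoid deadlock between two transactions; the steps above do not specify an order.

```python
import threading
import uuid


class InMemoryStore:
    """Hypothetical stand-in for a Cassandra column family: row key -> {column: value}."""

    def __init__(self):
        self.rows = {}


class InMemoryCoordinator:
    """Hypothetical stand-in for ZooKeeper: per-row locks plus a persistent undo journal."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()
        self.journal = {}  # txn_id -> {row_key: backup_key}

    def lock(self, key):
        with self._guard:
            row_lock = self._locks.setdefault(key, threading.Lock())
        row_lock.acquire()

    def unlock(self, key):
        self._locks[key].release()


def run_transaction(store, backups, coord, updates):
    """Apply updates ({row_key: {column: new_value}}) following steps 1-7."""
    txn_id = uuid.uuid4().hex
    keys = sorted(updates)  # step 1 (sorted order is an assumption, see above)
    acquired = []
    try:
        for k in keys:  # step 2: lock every row involved
            coord.lock(k)
            acquired.append(k)
        mapping = {}
        for k in keys:  # step 3: copy old values to K'0..K'n
            backup_key = f"{txn_id}:{k}"
            backups.rows[backup_key] = dict(store.rows.get(k, {}))
            mapping[k] = backup_key
        coord.journal[txn_id] = mapping  # step 4: persist the mapping
        try:
            for k in keys:  # step 5: apply the update
                store.rows.setdefault(k, {}).update(updates[k])
        except Exception:
            # Failure in step 5: restore every row from its backup.
            # A row that did not exist before is restored as an empty row.
            for row_key, backup_key in mapping.items():
                store.rows[row_key] = dict(backups.rows[backup_key])
            del coord.journal[txn_id]
            raise
        del coord.journal[txn_id]  # step 6: delete the journal entry
    finally:
        for k in acquired:  # step 7: unlock the rows
            coord.unlock(k)
    # Step 8 (lazy deletion of the backups) is left to a maintenance task.
    return txn_id
```

The journal entry is written only after all backups exist and removed only after the update or the rollback has finished, so at any moment a surviving journal entry points at complete backup data.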
Justification:
- If the transaction fails during steps 1-4, no change has been applied; I can abort the transaction and delete whatever has been stored in ZooKeeper and backed up in Cassandra, if anything.
- If the transaction fails at step 5, the information saved in step 3 is used to roll back any changes.
- If the server fails, crashes, or gets stolen by the cleaning man, then on restart, before serving any request, I check whether any keys from step 4 are still persisted in ZooKeeper. If so, I use those keys to fetch the backed-up data stored in step 3 and put that data back where it was, thus rolling back any failed transactions.
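The restart check in the last point could look roughly like the sketch below. Plain dictionaries stand in for Cassandra and for the persistent ZooKeeper entries, and `recover` is a hypothetical name chosen for illustration. It assumes the journal maps each transaction id to its {row key: backup key} mapping, as stored in step 4.

```python
def recover(store, backups, journal):
    """Roll back every transaction that still has a journal entry.

    store, backups: dict of row_key -> {column: value} (stand-ins for Cassandra)
    journal: dict of txn_id -> {row_key: backup_key} (stand-in for ZooKeeper)
    Returns the list of transaction ids that were rolled back.
    """
    rolled_back = []
    for txn_id in list(journal):
        mapping = journal[txn_id]
        for row_key, backup_key in mapping.items():
            # Backups are written in step 3, before the journal entry in
            # step 4, so a journaled backup key is expected to exist.
            store[row_key] = dict(backups[backup_key])
        # Remove the entry only after every row is restored, so a crash
        # during recovery simply repeats the same restore on the next start.
        del journal[txn_id]
        rolled_back.append(txn_id)
    return rolled_back
```

Because restoring a row from its backup gives the same result no matter how many times it runs, repeating the recovery after a second crash is harmless in this sketch.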
One of my concerns is what would happen if some of the servers are partitioned from the cluster. I have no experience in this area. Does my scheme work at all? And does it still work if a partition happens?
You should look into Cages: http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/
http://code.google.com/p/cages/