Cassandra has the option to enable "ReadRepair". A read is sent to all replicas, and if one is stale, it will be fixed/updated. But since all replicas receive the read, there will be a point where the nodes reach IO saturation. As ALL replica nodes always receive the read, adding further nodes will not help, as they also receive all reads (and will be saturated at once)?
Or does Cassandra offer some "tunability" to configure ReadRepair so that it does not go to all of the nodes (or offer any other "replication" that will allow true read scaling)?
thanks!! jens
Update: A concrete example, as I still do not understand how it will work in practice.
- 9 Cassandra "Boxes/Servers"
- 3 Replicas (N=3) => Every "Row" is written to 2 additional nodes (= 3 boxes hold the data in total)
- ReadRepair Enabled
- The row in question (let's say customer1) is highly trafficked
1.) The first time I write the row "Customer1" to Cassandra, it will eventually be available on all 3 nodes.
2.) Now I query the system with 1000's of requests per second for Customer1 (and to make it clearer, with any caching disabled).
3.) The read will always be dispatched to all 3 nodes. (The first request (to the nearest node) will be a full request for data, and the two additional requests will only be "checksum requests".)
4.) As we are querying with 1000's of requests, we reach the IO limit of all replicas! (The IO is the same on all 3 nodes!! Only the bandwidth is much smaller on the checksum nodes.)
5.) I add 3 further Boxes (so we have 12 Boxes in Total):
A) These boxes do NOT have the data yet (to help scale linearly). I first have to get the Customer1 record to at least one of these new boxes. => This means I have to change the replication factor to 4 (OR is there any other option to get the data to another box?)
And now we have the same problem. The replication factor is now 4, and all 4 boxes will receive the Read(Repair) request for this highly trafficked customer1 row. It does not scale this way. Scaling would only work if we had a copy that does NOT receive the ReadRepair request.
What is wrong in my understanding?? My conclusion: With standard ReadRepair the system will NOT scale linearly (for a single highly trafficked row), as adding further boxes just means that these boxes also receive the ReadRepair requests (for this trafficked row)...
Thanks very much!!! Jens
Adding further nodes will help (in most situations). There will only be N read repair "requests" during a read, where N is the ReplicationFactor (the number of replicas, N.B. not the number of nodes in the entire cluster). So a new node will only be included in a read / read repair if the data you request falls in that node's key range (or the node is holding a replica of the data).
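To make the point about N replicas versus total nodes concrete, here is a rough Python sketch of a hash ring. It is only an illustration under my own assumptions: `token`, `build_ring`, `replicas_for` and `reads_per_node` are hypothetical helper names, the MD5-based token is a stand-in and not Cassandra's actual partitioner, and each box gets a single token.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (illustrative hash, not Cassandra's partitioner)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes):
    """Return a sorted list of (token, node) pairs, one token per node."""
    return sorted((token(n), n) for n in nodes)


def replicas_for(key, ring, rf):
    """Walk clockwise from the key's token and take the next rf nodes."""
    tokens = [t for t, _ in ring]
    start = bisect.bisect_right(tokens, token(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]


def reads_per_node(keys, ring, rf):
    """Count how often each node is contacted if every key is read once on all its replicas."""
    counts = {node: 0 for _, node in ring}
    for key in keys:
        for node in replicas_for(key, ring, rf):
            counts[node] += 1
    return counts


nine = build_ring([f"box{i}" for i in range(9)])
twelve = build_ring([f"box{i}" for i in range(12)])
keys = [f"customer{i}" for i in range(10000)]

for name, ring in (("9 boxes", nine), ("12 boxes", twelve)):
    counts = reads_per_node(keys, ring, rf=3)
    print(name, "replicas for customer1:", replicas_for("customer1", ring, 3))
    print(name, "average keys per box:", sum(counts.values()) / len(ring))
```

In this model a read for customer1 touches 3 boxes whether the cluster has 9 or 12 of them, and the total number of replica contacts (keys times RF) is spread over more boxes as the cluster grows. Note that it shows the spread across many rows; a single row is still served by the same RF replicas.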
There is also the read_repair_chance tunable per ColumnFamily, but that is a more advanced topic and doesn't change the fundamental equation that you should scale reads by adding more nodes, rather than de-tuning read repair.
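As a rough way to think about what read_repair_chance changes, the sketch below models it as the probability that a given read goes to all replicas instead of only the ones the consistency level needs. This is a simplified model under my own assumptions, not Cassandra's implementation: `replicas_contacted`, `average_contacts` and the `required` parameter are hypothetical names for illustration.

```python
import random


def replicas_contacted(rf, required, read_repair_chance, rng):
    """Simplified model: contact all rf replicas only when this read is picked for read repair."""
    if rng.random() < read_repair_chance:
        return rf        # read repair chosen: every replica is contacted
    return required      # otherwise only as many replicas as the read needs


def average_contacts(rf, required, read_repair_chance, reads=100_000, seed=42):
    """Average number of replicas contacted per read over many simulated reads."""
    rng = random.Random(seed)
    total = sum(
        replicas_contacted(rf, required, read_repair_chance, rng)
        for _ in range(reads)
    )
    return total / reads


for chance in (0.0, 0.1, 1.0):
    print(chance, average_contacts(rf=3, required=1, read_repair_chance=chance))
```

Under this model the expected number of replicas contacted per read is required + chance * (rf - required), so lowering the chance reduces the per-read fan-out but never spreads the read beyond the RF replicas.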
You can read more about replication and consistency in Ben's slides.