I need to build an indexed database of all the domains in the world.
.
Example:
domain1.com ips: 1.1.1.1,2.2.2.2,3.3.3.3
domain2.com ips: 1.1.1.1,4.4.4.4
Requirements:
fast insertions
fast "selects"
an index on IPs: I need a fast "select" for all domains on a given IP, e.g. 1.1.1.1
.
I built it in Berkeley DB, and it seems fine (please pay attention to the "MANY_TO_MANY" annotation):
.
@Entity
public static class DomainInfo {
    @PrimaryKey String domain;
    @SecondaryKey(relate=MANY_TO_MANY) Set<String> IP = new HashSet<String>();
}
.
Can I build something like that in Cassandra?
Thanks a lot !!!
.
Yes, it's possible. You get fast inserts for free with Cassandra. Fast "selects"? As long as you construct appropriate column families with a reasonable index, you will have those too.
Index on IPs? Fine, just create a second column family for that index. Or wait for the upcoming 0.7 release (the RC is about to be released, and betas are already available) and use the built-in support for secondary indexes.
You could build a lookup model with these two column families as an example:
DomainLookup = {
  'domain1.com' : { 'ips' : '1.1.1.1,2.2.2.2,3.3.3.3' },
  'domain2.com' : { 'ips' : '1.1.1.1,4.4.4.4' }
}
ReverseLookup = {
  '1.1.1.1' : { 'domains' : 'domain1.com,domain2.com' },
  '2.2.2.2' : { 'domains' : 'domain1.com' },
  '3.3.3.3' : { 'domains' : 'domain1.com' },
  '4.4.4.4' : { 'domains' : 'domain2.com' }
}
This example is probably not ideal for your case, but remember that Cassandra is optimized for writes, so you can afford to maintain whatever additional indexes best fit your query scenarios. Plus, Cassandra adopts Dynamo's fully distributed design, which makes it easier to scale: when you add a new machine to your Cassandra cluster, it joins the ring and takes on a share of the storage and load. One thing you need to pay attention to is the choice between the Random and Order Preserving partitioners.
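To make the two-column-family idea concrete, here is a minimal sketch in plain Java. It is only an in-memory model of the DomainLookup / ReverseLookup layout, not Cassandra client code: the class name DomainIndex and its methods are hypothetical, and the sorted maps merely stand in for column families. The point it illustrates is that every insert writes to both structures, so the "all domains on an IP" query becomes a single key lookup.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the two column families above.
// It models the data layout only; it makes no Cassandra client calls.
class DomainIndex {
    // DomainLookup: row key = domain, value = the IPs it resolves to.
    private final Map<String, Set<String>> domainLookup = new TreeMap<>();
    // ReverseLookup: row key = IP, value = the domains hosted on it.
    private final Map<String, Set<String>> reverseLookup = new TreeMap<>();

    // Each insert writes to both structures so they stay in sync.
    void put(String domain, String... ips) {
        Set<String> known = domainLookup.computeIfAbsent(domain, k -> new TreeSet<>());
        for (String ip : ips) {
            known.add(ip);
            reverseLookup.computeIfAbsent(ip, k -> new TreeSet<>()).add(domain);
        }
    }

    // Forward query: all IPs of a domain (empty set if unknown).
    Set<String> ipsOf(String domain) {
        return Collections.unmodifiableSet(
                domainLookup.getOrDefault(domain, Collections.<String>emptySet()));
    }

    // Reverse query: all domains on an IP (empty set if unknown).
    Set<String> domainsOn(String ip) {
        return Collections.unmodifiableSet(
                reverseLookup.getOrDefault(ip, Collections.<String>emptySet()));
    }
}
```

With the data from the question, put("domain1.com", "1.1.1.1", "2.2.2.2", "3.3.3.3") followed by put("domain2.com", "1.1.1.1", "4.4.4.4") makes domainsOn("1.1.1.1") return both domains. In a real cluster the two writes would go to two column families, and keeping them consistent is the application's job.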