
HBase - Column family

Developer (https://www.devze.com), 2023-02-28 10:50. Source: Internet

I'm a beginner in HBase. I need to design my table. I want to play with the following information:

At the date XX-XX-XXXX, the word 'HELLO' is in documents 2, 3, 4 and the weights of these docs are 12, 45, 36. My raw data: doc:D title:'i like potatoes',weight:W,date:D

I created a table with row: word, column: date, value: doc. But I can't store multiple rows with the same date.

Can we create multiple column families for a table? What can be the best way to design the schema?

Thanks a lot


Is the date the most relevant piece of information for a document? As you say, with your current schema you can only store one document per date. An alternative is a compound row key such as DATE_TIME_DOCUMENT-ID. Document IDs could be a SHA-1 of the contents to ensure uniqueness. If you want recent documents to be easy to retrieve, you could also invert the timestamp (e.g. Long.MAX_VALUE - document timestamp), so that newer rows sort first. If you don't care about the date, documents can be keyed on their ID alone.
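To make the compound key idea concrete, here is a minimal sketch in plain Java (no HBase client needed) of how such a row key could be built. It assumes the key is laid out as 8 bytes of inverted timestamp followed by the 20-byte SHA-1 of the document contents. The names `RowKeys`, `rowKey` and `sha1` are hypothetical, chosen only for illustration. HBase sorts rows by the unsigned byte order of the row key, which the `main` method imitates with `Arrays.compareUnsigned`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of the HBase API.
class RowKeys {
    static final int SHA1_LENGTH = 20; // SHA-1 digests are always 20 bytes

    // Document ID derived from the contents, as suggested in the answer.
    static byte[] sha1(String contents) {
        try {
            return MessageDigest.getInstance("SHA-1")
                    .digest(contents.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed layout: [8 bytes inverted timestamp][20 bytes SHA-1 of contents].
    // Assumes a non-negative timestamp, e.g. milliseconds since the epoch.
    static byte[] rowKey(long timestampMillis, String contents) {
        long inverted = Long.MAX_VALUE - timestampMillis;
        return ByteBuffer.allocate(Long.BYTES + SHA1_LENGTH)
                .putLong(inverted)   // big-endian by default, so byte order follows numeric order
                .put(sha1(contents))
                .array();
    }

    public static void main(String[] args) {
        byte[] older = rowKey(1_000L, "i like potatoes");
        byte[] newer = rowKey(2_000L, "i like tomatoes");
        // A negative result means the newer document sorts before the older one.
        System.out.println(Arrays.compareUnsigned(newer, older) < 0);
    }
}
```

With this layout two documents written at the same instant still get distinct rows, because the SHA-1 suffix differs. The resulting byte array would be passed as the row key when writing to the table; word and weight could then live in columns of that row.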
