How would you model this in MongoDB?

开发者 https://www.devze.com 2023-01-16 15:35 Source: web

There are products with a name and price.

Users log about products they have bought.

# option 1: embed logs
product = { id, name, price }
user = { id, 
         name,
         logs : [{ product_id_1, quantity, datetime, comment },
                 { product_id_2, quantity, datetime, comment },
                 ... ,
                 { product_id_n, quantity, datetime, comment }] 
}

I like this. But if product ids are 12 bytes long, quantity and datetime are 32-bit (4-byte) integers, and comments average 100 bytes, then one log takes 12+4+4+100 = 120 bytes. The maximum size of a document is 4MB, so the maximum number of logs per user is 4MB/120 bytes ≈ 33,333. Assuming a user logs 10 purchases per day, the 4MB limit is reached in 33,333/10 ≈ 3,333 days, or about 9 years. Well, 9 years is probably fine, but what if we needed to store even more data? What if the user logs 100 purchases per day?
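The arithmetic above can be sketched in plain JavaScript to see how quickly the ceiling moves as the logging rate changes. The function names are made up for illustration, the field sizes are the assumptions stated above, and BSON overhead (field names, type bytes, array indexes) is ignored, so real capacity would be somewhat lower.

```javascript
// Rough capacity estimate for option 1 (logs embedded in the user document).
// Sizes are the assumptions from the text; BSON overhead is ignored.
const LOG_BYTES = 12 + 4 + 4 + 100; // product id + quantity + datetime + comment

// How many logs of the given size fit under the document size limit.
function maxEmbeddedLogs(limitBytes, logBytes = LOG_BYTES) {
  return Math.floor(limitBytes / logBytes);
}

// How many whole days of logging fit before the limit is reached.
function daysUntilFull(limitBytes, logsPerDay, logBytes = LOG_BYTES) {
  return Math.floor(maxEmbeddedLogs(limitBytes, logBytes) / logsPerDay);
}

// Decimal megabytes, to match the figures used in the text.
const LIMIT_4MB = 4 * 1000 * 1000;

console.log(maxEmbeddedLogs(LIMIT_4MB));    // 33333
console.log(daysUntilFull(LIMIT_4MB, 10));  // 3333 (about 9 years)
console.log(daysUntilFull(LIMIT_4MB, 100)); // 333 (under a year)
```

Passing a different limit (for example 16 MB, the figure documented for later MongoDB releases) only moves the ceiling. An embedded array that grows without bound still reaches it eventually.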

What is the other option here? Do I have to normalize this fully?

# option 2: normalized
product = { id, name, price }
log = { id, user_id, product_id, quantity, datetime, comment }
user = { id, name }

Meh. We are back to relational.


If size is the main concern, you can go with option 2 and use MongoDB's DBRef.

     logs : [{ product_id_1, quantity, datetime, comment },
             { product_id_2, quantity, datetime, comment },
             ... ,
             { product_id_n, quantity, datetime, comment }] 

and embed references to these logs inside the user using DBRef, something like

       var log = { product_id: "xxx", quantity: 2, comment: "something" }
       db.logs.save(log)
       // the collection name passed to DBRef must match exactly ("logs", no trailing space)
       var user = { id: "xx", name: "Joe", logs: [ new DBRef("logs", log._id) ] }
       db.users.save(user)


Yes, option 2 is your best bet. Yes, you're back to a relational model, but your data is best modeled that way. I don't see a particular downside to option 2: it's your data that requires you to go that way, not a bad design process.
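To make the trade-off concrete, here is a minimal sketch of how option 2 is read back, with in-memory arrays standing in for the three collections. The ids, sample values, and the `purchasesFor` helper are all hypothetical. The point is only that the "join" becomes an application-side step.

```javascript
// In-memory stand-ins for the three collections of option 2.
// All ids and values are made up for illustration.
const products = [
  { id: "p1", name: "Coffee", price: 3.5 },
  { id: "p2", name: "Bagel", price: 2.0 },
];
const users = [{ id: "u1", name: "Joe" }];
const logs = [
  { id: "l1", user_id: "u1", product_id: "p1", quantity: 2, datetime: 1673800000, comment: "morning" },
  { id: "l2", user_id: "u1", product_id: "p2", quantity: 1, datetime: 1673803600, comment: "snack" },
  { id: "l3", user_id: "u2", product_id: "p1", quantity: 1, datetime: 1673807200, comment: "other user" },
];

// Application-side join: select one user's logs, then attach product details.
function purchasesFor(userId) {
  const productById = new Map(products.map((p) => [p.id, p]));
  return logs
    .filter((log) => log.user_id === userId)
    .map((log) => {
      const product = productById.get(log.product_id);
      return {
        product: product.name,
        quantity: log.quantity,
        total: product.price * log.quantity,
        comment: log.comment,
      };
    });
}

console.log(purchasesFor("u1"));
```

Against a real database, the `filter` step would roughly correspond to a query such as `db.logs.find({ user_id: ... })`, and an index on `user_id` would likely be worth adding. Because each log is its own document, no single document grows with the number of purchases.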
