Pattern for managing large user uploaded datasets?

开发者 (Developer) https://www.devze.com 2023-01-29 13:53 Source: web

I'm a relatively noob programmer. I am creating a web-based GIS tool where users can upload custom datasets ranging from 10 rows to 1 million. The datasets can have variable columns and data types. How do you manage these user-submitted datasets?

Is creating a table per dataset a bad idea? (BTW, I'll be using PostgreSQL as the database.)

My apologies if this is already answered somewhere, but my search did not turn up any good results. I may be using bad keywords in my search.

Thanks!


Creating a table per dataset is not a 'bad' idea at all. swivel.com was a very similar app to what you are describing. We used a table per dataset, and it worked very well for generating graphs from user-uploaded datasets and for comparing data across datasets using joins. We had over 10k datasets and close to a million graphs, and some datasets were very large.

You also get a lot of functionality for free from your ORM layer. For instance, we could use Active Record to work with a dataset (each dataset is a generated model class with its table set to the actual table).
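
To make the "generated model class" idea concrete, here is a minimal sketch in Python. It is not Active Record and not what swivel.com ran; `DatasetModel`, `model_for` and the `dataset_<id>` table naming are hypothetical names made up for illustration, and `sqlite3` stands in for PostgreSQL only so the snippet runs on its own.

```python
import sqlite3


class DatasetModel:
    """Minimal base class (hypothetical); subclasses differ only in table_name."""

    table_name = None

    @classmethod
    def count(cls, conn):
        # Number of rows in the table this generated class is bound to.
        query = f'SELECT COUNT(*) FROM "{cls.table_name}"'
        return conn.execute(query).fetchone()[0]

    @classmethod
    def all(cls, conn):
        # Every row of the bound table, as a list of tuples.
        return conn.execute(f'SELECT * FROM "{cls.table_name}"').fetchall()


def model_for(dataset_id):
    """Generate a model class whose table is the table of one dataset."""
    number = int(dataset_id)  # the table name is built from an integer only
    return type(
        f"Dataset{number}",
        (DatasetModel,),
        {"table_name": f"dataset_{number}"},
    )


# Toy data, made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "dataset_7" (site TEXT, depth_m REAL)')
conn.executemany('INSERT INTO "dataset_7" VALUES (?, ?)', [("a", 1.5), ("b", 2.5)])

Dataset7 = model_for(7)
row_count = Dataset7.count(conn)
```

With a real ORM you would hand the generated class the table name in whatever way that ORM supports; the point is only that one small factory can cover every uploaded dataset.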

Pitfall-wise, you have to do a LOT of joins if you need any kind of cross-dataset calculation.
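
A rough sketch of the table-per-dataset approach, including the kind of join mentioned above. Everything here is an assumption for illustration: the helper names (`safe_identifier`, `create_dataset_table`), the `dataset_<id>` naming scheme, the type mapping and the toy rows are all made up, and `sqlite3` is used instead of PostgreSQL only so the example is self-contained.

```python
import re
import sqlite3

# Hypothetical mapping from declared upload types to SQL column types.
SQL_TYPES = {"int": "INTEGER", "float": "REAL", "text": "TEXT"}


def safe_identifier(name):
    """Reduce a user-supplied name to lowercase letters, digits and underscores."""
    cleaned = re.sub(r"[^a-z0-9_]", "_", name.strip().lower())
    if not cleaned or cleaned[0].isdigit():
        cleaned = "c_" + cleaned
    return cleaned


def create_dataset_table(conn, dataset_id, columns, rows):
    """Create one table for one uploaded dataset and bulk-insert its rows.

    columns is a list of (name, type) pairs, with type a key of SQL_TYPES.
    """
    table = f"dataset_{int(dataset_id)}"  # never built from user text
    col_defs = ", ".join(
        f'"{safe_identifier(name)}" {SQL_TYPES[kind]}' for name, kind in columns
    )
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return table


conn = sqlite3.connect(":memory:")
sites = create_dataset_table(
    conn, 1, [("Site", "text"), ("Count", "int")], [("a", 10), ("b", 20)]
)
depths = create_dataset_table(
    conn, 2, [("Site", "text"), ("Depth m", "float")], [("a", 1.5), ("b", 2.5)]
)

# A cross-dataset comparison is a join between the two generated tables.
joined = conn.execute(
    f'SELECT s.site, s.count, d.depth_m '
    f'FROM "{sites}" AS s JOIN "{depths}" AS d ON s.site = d.site '
    f'ORDER BY s.site'
).fetchall()
```

Values always go through `?` placeholders. Identifiers cannot be parameterized, which is why column names are sanitized and the table name is derived from an integer id. Two column names that sanitize to the same identifier would still collide, so a real importer would need to deduplicate them.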


My coworkers and I recently tackled a similar problem where we had a poor data model in MySQL and were looking for better ways to implement it. We weighed a few different options, including MongoDB, and ended up using the entity-attribute-value (EAV) model. The EAV model is essentially a 3-column model: entity, attribute, value. It allowed us to use a single model to represent a variable number of columns and data types.

You can read a little about our problem here, but it sounds like it might be a good fit for you too.
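
For reference, a minimal sketch of what a 3-column EAV table can look like. This is not the schema from the answer above; the table name `eav`, the helpers `store_rows` and `load_row`, and the toy rows are assumptions for illustration, and `sqlite3` stands in for whichever database you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eav ("
    " entity INTEGER NOT NULL,"   # which uploaded row the cell belongs to
    " attribute TEXT NOT NULL,"   # the column name from the upload
    " value TEXT,"                # the cell, stored as text
    " PRIMARY KEY (entity, attribute))"
)


def store_rows(conn, rows):
    """Flatten dict-shaped rows into one (entity, attribute, value) triple per cell."""
    triples = [
        (entity, attribute, str(value))
        for entity, row in enumerate(rows)
        for attribute, value in row.items()
    ]
    conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", triples)


def load_row(conn, entity):
    """Rebuild one original row by collecting every triple for its entity."""
    cursor = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(cursor.fetchall())


# Rows with different column sets fit in the same table.
store_rows(
    conn,
    [
        {"site": "a", "depth_m": 1.5},
        {"site": "b", "depth_m": 2.5, "note": "dry"},
    ],
)
first = load_row(conn, 0)
second = load_row(conn, 1)
```

Note the trade-off: every value comes back as text, so the original types have to be tracked somewhere else, and reading a full row means collecting several triples rather than selecting one table row. To hold many uploads in one table you would also need some way to tell datasets apart, for example an extra dataset id column.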
