Rails compressing data before saving to database

Developer https://www.devze.com 2023-03-26 15:32 Source: Internet

I need to store a large amount of data in a MySQL database. To save disk space, I would like to compress the text data before storing it.

I know compressing and decompressing the data will cost some performance, but I am going to cache the decompressed data on a CDN, and most of the data will not become stale for months or even years.

Can you point me to some good compression/decompression techniques? I am also open to alternatives to compressing and decompressing the data.


If you want a pure MySQL solution, you could try the ARCHIVE storage engine for your table. The documentation describes it as an insert-only engine with no updates, meant specifically for what you describe: stashing away things that won't change for years.

To do the same thing with a conventional engine, you would have to run your data through zlib yourself. Remember that compression performs very poorly on already-compressed data, such as most popular image and video formats. You describe your data as mostly text, which usually compresses quite well.
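To get a feel for that difference, here is a minimal sketch that deflates a repetitive text sample and an equally long run of random bytes. The random bytes are only a stand-in for already-compressed media, and the sample text and the `compression_ratio` helper are made up for illustration.

```ruby
require "zlib"
require "securerandom"

# Repetitive, text-like payload (a stand-in for the "mostly text" data in the question).
text = "The quick brown fox jumps over the lazy dog. " * 200

# Random bytes behave much like already-compressed data: no redundancy left to exploit.
noise = SecureRandom.random_bytes(text.bytesize)

# Hypothetical helper: compressed size as a fraction of the original size.
def compression_ratio(data)
  Zlib::Deflate.deflate(data, Zlib::BEST_COMPRESSION).bytesize.to_f / data.bytesize
end

text_ratio  = compression_ratio(text)
noise_ratio = compression_ratio(noise)

puts format("text:  %.1f%% of original size", text_ratio * 100)
puts format("noise: %.1f%% of original size", noise_ratio * 100)
```

Real prose will not shrink as dramatically as this artificially repetitive sample, so it is worth measuring against a representative slice of your own data.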

Ruby has Zlib::Deflate and Zlib::Inflate, which can compress and expand data on demand. You could write your own wrapper similar to the JSON one by implementing the two coder methods on your module (`dump` and `load` for ActiveRecord's `serialize`).
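The wrapper idea could be sketched as a small module with a `dump`/`load` pair, which, as far as I know, is the interface ActiveRecord's `serialize` expects of a custom coder. The module name `CompressedText` is hypothetical, and the round trip at the bottom only exercises the module on its own, without Rails.

```ruby
require "zlib"

# Hypothetical coder module; the name CompressedText is illustrative only.
module CompressedText
  # Called before saving: compress the string (nil passes through untouched).
  def self.dump(value)
    return nil if value.nil?
    Zlib::Deflate.deflate(value.to_s)
  end

  # Called after loading: expand the stored bytes and mark them as UTF-8 text.
  def self.load(blob)
    return nil if blob.nil?
    Zlib::Inflate.inflate(blob).force_encoding(Encoding::UTF_8)
  end
end

# Stand-alone round trip, no database involved.
original = "Lorem ipsum dolor sit amet. " * 100
stored   = CompressedText.dump(original)
restored = CompressedText.load(stored)

puts "#{original.bytesize} bytes -> #{stored.bytesize} bytes, round trip ok: #{restored == original}"
```

In a model this would presumably be wired up with something like `serialize :body, CompressedText` (or `serialize :body, coder: CompressedText` on recent Rails versions), with `body` being a hypothetical binary (BLOB) column so the compressed bytes are not mangled by a text encoding.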

One thing to consider: you can probably store the compressed data on your CDN, so long as you can be sure your clients support gzip encoding. I don't know of any major browsers that don't, as asset compression has become quite standard, especially in the mobile space.
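As a rough illustration of what serving pre-compressed data involves, the sketch below gzips a body in memory and builds the response headers an origin would send along with it. The `gzip` helper, the sample body, and the header values are assumptions for the example; how the payload actually reaches your CDN depends on the provider.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: gzip a string in memory and return the binary gzip stream.
def gzip(text)
  io = StringIO.new(String.new) # String.new gives an empty binary buffer
  gz = Zlib::GzipWriter.new(io)
  gz.write(text)
  gz.finish # writes the gzip trailer without closing the underlying StringIO
  io.string
end

body    = "Static text that rarely changes. " * 500
payload = gzip(body)

# Headers that tell the client the body is gzip-encoded.
headers = {
  "Content-Type"     => "text/plain; charset=utf-8",
  "Content-Encoding" => "gzip",
  "Content-Length"   => payload.bytesize.to_s
}

# What a client that sent "Accept-Encoding: gzip" effectively does on receipt.
decoded = Zlib::GzipReader.new(StringIO.new(payload)).read

puts headers.inspect
puts "#{body.bytesize} -> #{payload.bytesize} bytes; matches: #{decoded == body}"
```

Storing the gzip stream directly means the application never has to decompress on the request path, at the cost of relying on clients to honour `Content-Encoding: gzip`.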


If the data is really as static as you say, then save it as zipped XML files.

You can unzip them and zip them up again really easily as and when needed. In Rails, generating an XML file is dead simple using SomeModel.to_xml, the output of which can easily be sent to a file, so maintaining the files will be simple too.

You can just as easily work the other way round: when it comes to reading the data in, you can simply convert it back into a model. Rails 3.x has ActiveModel, which would be ideal for this scenario: the data is not backed by a database, but you still get an ActiveRecord-style API and all the juice that AR gives you, so your views, controllers, etc. are working with a consistent API and consistent behaviour.
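A minimal sketch of the file round trip, assuming gzip (via Ruby's standard Zlib) rather than the .zip archive format. The XML here is hand-written as a stand-in for what SomeModel.to_xml might produce, so that the example runs without Rails, and the file name is made up.

```ruby
require "zlib"
require "tmpdir"

# Stand-in for the output of SomeModel.to_xml (hand-written so the sketch runs without Rails).
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <article>
    <id type="integer">1</id>
    <title>Hello</title>
    <body>#{'Static text that rarely changes. ' * 50}</body>
  </article>
XML

restored     = nil
stored_bytes = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "article-1.xml.gz") # hypothetical file name

  # Write: gzip the XML straight to disk.
  Zlib::GzipWriter.open(path) { |gz| gz.write(xml) }
  stored_bytes = File.size(path)

  # Read: expand it again when the data is needed.
  restored = Zlib::GzipReader.open(path) { |gz| gz.read }
end

puts "#{xml.bytesize} bytes of XML stored in #{stored_bytes} bytes"
puts "round trip ok: #{restored == xml}"
```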

You have other options as well, such as ActiveResource, but I don't think that is necessary. I would not recommend this approach if you were not caching the data in the way you suggest (which is a neat solution, by the way).
