mysql key/value store problem

https://www.devze.com, 2023-03-21 18:12, source: web
I'm trying to implement a key/value store with mysql

I have a user table that has 2 columns, one for the global ID and one for the serialized data.

Now the problem is that every time any bit of the user's data changes, I have to retrieve the serialized data from the db, deserialize it, alter it, re-serialize it, and write it back to the db. I have to repeat these steps even for a very small change to any of the user's data, since there's no way to update part of that cell within the db itself.
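The read-modify-write cycle described above can be sketched as follows (using `sqlite3` in memory as a stand-in for MySQL; the `user` table and its contents are hypothetical):

```python
import json
import sqlite3

# Stand-in for the MySQL table: one row per user,
# with the whole profile serialized into a single column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute("INSERT INTO user VALUES (1, ?)",
             (json.dumps({"name": "alice", "city": "Paris"}),))

# Even a one-field change requires the full cycle:
row = conn.execute("SELECT data FROM user WHERE id = 1").fetchone()  # 1. fetch
profile = json.loads(row[0])                                         # 2. deserialize
profile["city"] = "Berlin"                                           # 3. tiny change
conn.execute("UPDATE user SET data = ? WHERE id = 1",
             (json.dumps(profile),))                                 # 4. rewrite the whole blob
```

The whole serialized value crosses the wire twice for a single changed field, which is the cost being asked about.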

Basically, I'm looking for the solutions people normally use when faced with this problem.


Maybe you should preprocess your JSON data and insert it as a proper MySQL row, separated into fields.

Since your input is JSON, you have various alternatives for converting the data.

You mentioned that many small changes happen in your case. Where do they occur? In a member of a list? In a top-level attribute?

If updates occur mainly in list members within part of your JSON data, then perhaps each member should be represented as a separate row in its own table.

If updates occur in a top-level attribute, then represent it as a column.

I don't think the cost of preprocessing will hurt in your case.
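A minimal sketch of that preprocessing, assuming a hypothetical user document with top-level attributes and one list (`sqlite3` again standing in for MySQL; all table and field names are made up):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Top-level attributes become columns; list members become rows in a child table.
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE TABLE user_tag (user_id INTEGER, tag TEXT)")

raw = '{"id": 1, "name": "alice", "city": "Paris", "tags": ["admin", "beta"]}'
doc = json.loads(raw)

conn.execute("INSERT INTO user VALUES (?, ?, ?)",
             (doc["id"], doc["name"], doc["city"]))
conn.executemany("INSERT INTO user_tag VALUES (?, ?)",
                 [(doc["id"], t) for t in doc["tags"]])

# A small change now touches only one field or one row:
conn.execute("UPDATE user SET city = ? WHERE id = ?", ("Berlin", 1))
conn.execute("DELETE FROM user_tag WHERE user_id = ? AND tag = ?", (1, "beta"))
```

The point is that the update statements at the end no longer need to read, deserialize, or rewrite anything.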


When this is a problem, people do not use key/value stores; they design a normalized relational database schema that stores the data in separate, single-valued columns which can be updated individually.


To be honest, your solution uses the database as a glorified file system. I would not recommend this approach for data that is core to your application.

The best way to use a relational database, in my opinion, is to store relational data - tables, columns, primary and foreign keys, data types. There are situations where this doesn't work - for instance, if your data is really a document, or when the data structures aren't known in advance. For those situations, you can either extend the relational model, or migrate to a document or object database.

In your case, I'd first see whether the serialized data could be modeled as relational data, and whether you even need a database. If so, move to a relational model. If you need a database but can't model the data as a relational set, you could go for a key/value model where you extract your serialized data into individual key/value pairs; this at least means you can update or add an individual data field rather than rewrite the entire document. Key/value is not a natural fit for an RDBMS, but it may be a smaller jump from your current architecture.
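Extracting the serialized document into individual key/value pairs could look like this entity-attribute-value sketch (`sqlite3` as a stand-in for MySQL; the `user_kv` table and its keys are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (user, field) pair instead of one serialized blob per user.
conn.execute("""CREATE TABLE user_kv (
                    user_id INTEGER,
                    key     TEXT,
                    value   TEXT,
                    PRIMARY KEY (user_id, key))""")
conn.executemany("INSERT INTO user_kv VALUES (?, ?, ?)",
                 [(1, "name", "alice"), (1, "city", "Paris")])

# Updating one field no longer rewrites the whole document:
conn.execute("UPDATE user_kv SET value = ? WHERE user_id = ? AND key = ?",
             ("Berlin", 1, "city"))
# Adding a new field is a single insert:
conn.execute("INSERT INTO user_kv VALUES (?, ?, ?)", (1, "plan", "pro"))
```

The trade-off is that every value is stored as text and reassembling a full user means pivoting many rows back together.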


When you have a key/value store (assuming your serialized data is JSON), it is effective only when you pair it with memcached, because you don't update the database on the fly every time; instead you update the memcached copy and push it to the database in the background. So you definitely still update the entire value rather than an individual field of your JSON data, such as the address alone. But you can update and retrieve data fast from memcached, and since there are no complex relations in the database, pushing and pulling data between the database and memcache is fast as well.
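The cache-then-flush pattern described above can be sketched as follows. This is a simplification: a plain dict stands in for memcached, `sqlite3` stands in for MySQL, and `update_field`/`flush` are hypothetical helper names.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute("INSERT INTO user VALUES (1, ?)", (json.dumps({"city": "Paris"}),))

cache = {}  # stand-in for memcached

def update_field(user_id, key, value):
    """Mutate the cached copy immediately; the DB write is deferred."""
    if user_id not in cache:
        row = conn.execute("SELECT data FROM user WHERE id = ?",
                           (user_id,)).fetchone()
        cache[user_id] = json.loads(row[0])
    cache[user_id][key] = value  # fast in-memory update

def flush(user_id):
    """Background job: push the whole serialized value back to the DB."""
    conn.execute("UPDATE user SET data = ? WHERE id = ?",
                 (json.dumps(cache[user_id]), user_id))

update_field(1, "city", "Berlin")   # reads and writes hit the cache from now on
flush(1)                            # the DB catches up later, in one write
```

Many small field changes collapse into one eventual database write, which is the performance win this answer is pointing at. The cost is that data in the cache can be lost before it is flushed.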


I would continue with what you are doing and create separate tables for the indexable data. This lets you treat your database as a single data store that is easy to manage through most operational tasks, including updates, backups, restores, clustering, etc.

The only other thing you may want to consider is adding Elasticsearch to the mix if you need to perform anything like a LIKE query, purely for improved search performance.

If space is not an issue for you, I would even make it an insert-only database, so any change adds a new record and you keep the history. You may eventually want to remove the older records, but a background job can delete the superseded records in batches. (Mind you, what I've described is basically Kafka.)
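The insert-only idea can be sketched like this (again with `sqlite3` as a stand-in for MySQL; the `user_history` table and the `save`/`latest` helpers are hypothetical names):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Append-only: every change is a new versioned row; nothing is updated in place.
conn.execute("""CREATE TABLE user_history (
                    user_id INTEGER,
                    version INTEGER,
                    data    TEXT,
                    PRIMARY KEY (user_id, version))""")

def save(user_id, doc):
    """Append the next version instead of overwriting the current one."""
    (ver,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) + 1 FROM user_history "
        "WHERE user_id = ?", (user_id,)).fetchone()
    conn.execute("INSERT INTO user_history VALUES (?, ?, ?)",
                 (user_id, ver, json.dumps(doc)))

def latest(user_id):
    """Read the newest version; older rows remain as history."""
    row = conn.execute(
        "SELECT data FROM user_history WHERE user_id = ? "
        "ORDER BY version DESC LIMIT 1", (user_id,)).fetchone()
    return json.loads(row[0])

save(1, {"city": "Paris"})
save(1, {"city": "Berlin"})  # supersedes the first row but keeps it as history
```

A batch job could later delete all but the newest version per user, as the answer suggests.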

There are many alternatives out there now that beat an RDBMS in terms of performance. However, they all add operational overhead, since each is yet another piece of middleware to maintain.

The way around that, if you have a microservices architecture, is to keep the middleware as part of your microservice stack. However, you then have to deal with transmitting the data across the microservices, so you'd still end up switching to something like Kafka underneath it all.
