sorry if the title is not very clear. I will try to explain now:
I have two tables: table A and table B. The relation between them is one (for A table) to many (for B table). So, it's something like master-detail situation. I have a column 'Amount' in table B tha开发者_C百科t is obviously decimal and a column 'TotalAmount' in table A. I'm trying ti figure out how to keep the value in table A up to date. My suggestion is to make a view based on table A with aggregate query counting the Amounts from table B. Of course with the proper indexes ... But, my team-mate suggest to update the value in table A every time we change something in table B from our application. I wonder, what will be the best solution here? May be a third variant?
Some clarification ... We expected this tables to be the fastest growing tables in our database. And The table B will grow much much faster than table A. The most frequent operation in table B will be insert ... and almost nothing else. The most frequent operation in table A will be select ... but not only.
I see a number of options:
- Use an insert trigger on table B and do the updates table A as your friend suggests. This will keep table B as up to date as possible.
- Have a scheduled job that updates table A every x minutes (x = whatever makes sense for your application).
- When updating table B, do an update on table A in your application logic. This may not work out if you update table B in many places.
If you have a single place in your app where you insert new rows to table B, then the most simple solution is to send an UPDATE A set TotalAmount=TotalAmount + ? where ID = ?
and pass the values you just used to insert into B. Make sure you wrap both queries (the insert and the update) in a transaction so either both happen or none.
If that's not simple, then your next option is a database trigger. Read the docs for your database how to create them. Basically a trigger is a small piece of code that gets executed when something happens in the DB (in your case, when someone inserts data in table B).
The view is another option but it may cause performance problems during selects which you'll find hard to resolve. Try a "materialized view" or a "computed column" instead (but these can cause performance problems when you insert/remove columns).
If this value is going to change a lot, you're better off using a view: It is definitely the safer implementation. But even better would be using triggers (if your database supports them.)
I would guess your mate suggests updating the value on each insert because he thinks that you will need the value quite often and it might lead to a slow-down recalculating the value each time. If that is the case:
- Your database should take care of the caching, so this probably won't be an issue.
- If it is, nonetheless, you can add that feature at a later stage - this way you can make sure your application works otherwise and will have a much easier time debugging that cache column.
I would definitely recommend using a trigger over using application logic, as that ensures that the database keeps the value up-to-date, rather than relying on all callers. However, from a design point of view, I would be wary of storing generated data in the same table as non-generated data -- I believe it's important to maintain a clear separation, so people don't get confused between what data they should be maintaining and what will be maintained for them.
However, in general, prefer views to triggers -- that way you don't have to worry about maintaining the value at all. Profile to determine whether performance is an issue. In Postgres, I believe you could even create an index on the computed values, so the database wouldn't have to look at the detail table.
The third way, recalculating periodically, will be much slower than triggers and probably slower than a view. That it's not appropriate for your use anyway is the icing on the cake :).
精彩评论