I have been thinking about the开发者_C百科 best way to calculate the amount of on-hand units for a given inventory item. The tables will be very large, with many transactions so it may not be realistic to calculate on-hand on demand by adding all receivings and subtracting all shipments and sales. One idea I have come across is to periodically create checkpoints, i.e. records of item and and on-hand quantity. Any subsequent calculations of on-hand quantity start from the checkpoint forward, negating the need to sum over all transactions.
inventory item table
id | location | item
1 1 234
2 1 567
receiving table
inv item | stamp | quantity
1 2010-08-10 200
1 2010-08-30 10
2 2010-08-30 20
shipment table
inv item | stamp | quantity
1 2010-08-10 40
1 2010-08-30 5
2 2010-08-30 2
sale table
inv item | stamp | quantity
1 2010-08-10 10
1 2010-08-15 15
1 2010-08-30 1
1 2010-08-30 1
2 2010-08-30 2
checkpoint table
inv item | stamp | quantity
1 2010-08-11 150
1 2010-08-28 135
2 2010-08-15 15
Calculating on hand for inv item 1 on 8/30 would be like this
get most recent checkpoint(inv item 1, 8/30) returns 135 units on 8/28
on-hand = 135 + receivings - (shipments + sales)
only rcv/shp/sales that occur between the most recent checkpoint 8/30
Calculating on hand for inv item 1 on 8/14 would be like this
get most recent checkpoint(inv item 1, 8/14) returns 150 units on 8/11
on-hand = 150 + receivings - (shipments + sales)
only rcv/shp/sales that occur between the most recent checkpoint and 8/14
My question is, what sort of problems does this approach pose? Are there any better approaches for dealing with enormous transaction sets other than storing the on-hand quantity in a table? Is storing the on-hand quantity in a table almost the same as the checkpoint method, perhaps even less prone to error if updated via triggers or some sort of signals?
Instead of "checkpoints", I'd look at building summary tables in which you summarize the on hand inventory values (and any other values you need) at some interval. This interval could be once per day, and you could run the long running process to build the summary table in a nightly batch job, so that your users wouldn't be affected (assuming your users are using this data during daytime hours).
In the case where you need the data refreshed more frequently, you could refresh the summary tables multiple times per day with the same batch job. However, to minimize data unavailability, you could actually have 2 sets of summary tables, one set that you load with the queries (which take a long time to load since the queries run a long time), and the "real" set that you report off of. But you just load the tables from the first summary table to the second summary table once the queries finish (a very fast operation since you're just dumping the data) so the downtime is minimal. So altogether, the process would look like this:
- Load table
summary1
with the inventory data. This is where you run the queries to do the "heavy lifting". - Once the queries finish, load table
summary2
fromsummary1
. This would be the only downtime your users would experience since even during the time thatsummary1
was being loaded, they'd still be reporting off ofsummary2
, which would be unaffected.
The overall approach you've described seems to be fine, although you'll want to focus on what can happen with edge cases. As just one example, think about what would happen in the following scenario:
- Item "A" has an on-hand quantity of 1.
- Customer orders 1 unit of "A". On-hand quantity is now 0. (checkpoint of 1 minus 1 sale)
- Checkpoint is determined. Quantity of "A" is now fixed at 0 units.
- Item "A" can not be shipped to customer for some reason (customer's payment method was declined, customer cancels order, etc.)
- Item "A" now needs to be reverted to show an on-hand quantity of 1. Removing the "sale" record won't do anything, since it was logged before the checkpoint.
In this case there's a couple of solutions that could work (modify the checkpoint value, or better yet, post a "revert" record so there's an audit trail), but that's just one example of things you should be thinking through with this approach.
精彩评论