I have an idea I have yet to implement, because I have some fear I may be barking up the wrong tree... mainly because Googling on the topic returns so few results.
Basically I have some SQL queries that are slow, in large part because they have subqueries that are time-consuming. For example, they might do things like "give me a count of all bicycles that are red and ridden by boys between the ages of 10-15". This is expensive as it sloshes through all of the bicycles, but the end result is a single number. And, in my case, I don't really need that number to be 100% up to date.
The ultimate solution for problems of this sort seems to be to apply an OLAP-based engine to pre-cache these permutations. However, in my case I'm not really trying to slice and dice the data around a ton of metrics, and I'd love not to have to complicate my architecture with yet another process/datastore running.
So... my idea was basically memoizing these subqueries in the database. I might have a table called "BicycleStatistics" and it might store the output of that subquery above as a name value pair of it's inputs and outputs.
Ex name: "c_red_g_male_a_10-15" value: 235
And have a mechanism that memoizes those values to that table as the queries are run.
Has anyone been in this situation and tried anything similar? The reason I think a solution like this is valuable over the "throw a lot of RAM in your DB and let the database handle it" is (A) my database is bigger than the amount of RAM I can conveniently throw at it, and (B) the database is going to ensure I get the exact right number fo开发者_如何转开发r these statistics, and my big win, above, is that I'm ok with the numbers being a day or two out of date.
Thanks for any thoughts/feedback.
Tom
Materialized views are a way of achieving this requirement, if your DBMS supports them.
精彩评论