In an application I need to query a Postgres DB where I expect tens or even hundreds of millions of rows in the result set. I might run this query once a day, or even more frequently. The query itself is relatively simple, although it may involve a few JOINs.
My question is: How smart is Postgres with respect to avoiding having to seek around the disk for each row of the result set? Given the time required for a hard disk seek, this could be extremely expensive.
If this isn't an issue, how does Postgres avoid it? How does it know how to lay out data on the disk such that it can be streamed out in an efficient manner in response to this query?
When PostgreSQL analyzes your data, one of the statistics it calculates, and which the query planner uses, is the correlation between the ordering of values in your field or index and their order on disk:
Statistical correlation between physical row ordering and logical ordering of the column values. This ranges from -1 to +1. When the value is near -1 or +1, an index scan on the column will be estimated to be cheaper than when it is near zero, due to reduction of random access to the disk. (This column is NULL if the column data type does not have a < operator.)
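To make the statistic concrete, here is a minimal sketch of such a correlation: the Pearson correlation between each row's physical position and the rank of its value in sorted order. This is only an illustration of the idea, not PostgreSQL's actual code (the real statistic is computed from a sample during ANALYZE), and the function name is made up for this example.

```python
from statistics import mean


def physical_order_correlation(values_in_disk_order):
    """Pearson correlation between physical position and logical rank.

    Hypothetical helper for illustration; needs at least two rows.
    """
    n = len(values_in_disk_order)
    positions = list(range(n))
    # Rank of each row's value in sorted order (ties broken by position).
    order = sorted(positions, key=lambda i: (values_in_disk_order[i], i))
    ranks = [0] * n
    for rank, pos in enumerate(order):
        ranks[pos] = rank
    mean_pos, mean_rank = mean(positions), mean(ranks)
    cov = sum((p - mean_pos) * (r - mean_rank) for p, r in zip(positions, ranks))
    var_pos = sum((p - mean_pos) ** 2 for p in positions)
    var_rank = sum((r - mean_rank) ** 2 for r in ranks)
    return cov / ((var_pos * var_rank) ** 0.5)


if __name__ == "__main__":
    print(physical_order_correlation([10, 20, 30, 40, 50]))  # stored in sorted order
    print(physical_order_correlation([50, 40, 30, 20, 10]))  # stored in reverse order
    print(physical_order_correlation([40, 20, 50, 10, 30]))  # stored in mixed order
```

A table whose rows were inserted in key order scores close to +1, one stored in reverse order scores close to -1, and a table with no relationship between key and position scores near zero.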
The index cost estimation functions also calculate a correlation:
The indexCorrelation should be set to the correlation (ranging between -1.0 and 1.0) between the index order and the table order. This is used to adjust the estimate for the cost of fetching rows from the parent table.
I don't know for sure, but I assume the planner uses the correlation values of the various possible plans to decide which is the cheaper way to read the required rows from a table: a table scan, which uses sequential I/O (possibly joining in with another concurrent scan of the same table) and filters for the required rows, or an index scan, with its resulting seeks.
PostgreSQL doesn't keep tables sorted according to any particular key, but they can periodically be recreated in a particular index order using the CLUSTER command (which will be slow, with a disk seek per row, if the data to cluster has low correlation to the index values order).
PostgreSQL is able to effectively collect a set of disk blocks that need retrieving, then obtain them in physical order to reduce seeking. It does this through Bitmap Scans. Release Notes for 8.1 say:
Bitmap scans are useful even with a single index, as they reduce the amount of random access needed; a bitmap index scan is efficient for retrieving fairly large fractions of the complete table, whereas plain index scans are not.
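As a rough illustration of why this helps, the toy model below compares visiting heap blocks in index order with first collecting the set of matching blocks and then visiting them in physical order. It is a sketch under simplifying assumptions (one "seek" per non-adjacent block, no caching), not how PostgreSQL implements its bitmaps, and all names and data are invented for the example.

```python
def count_seeks(block_sequence):
    """Count reads that are not to the same block or the one right after it."""
    seeks = 0
    previous = None
    for block in block_sequence:
        if previous is None or block not in (previous, previous + 1):
            seeks += 1
        previous = block
    return seeks


def plain_index_scan(index_entries):
    """Visit heap blocks in index (key) order, one visit per matching row."""
    return [block for _key, block in sorted(index_entries)]


def bitmap_scan(index_entries):
    """Collect the matching blocks first, then visit each once in physical order."""
    return sorted({block for _key, block in index_entries})


if __name__ == "__main__":
    # Hypothetical (key, heap block number) pairs returned by an index.
    entries = [(1, 7), (2, 2), (3, 8), (4, 2), (5, 3), (6, 7)]
    print(count_seeks(plain_index_scan(entries)))  # index order jumps around
    print(count_seeks(bitmap_scan(entries)))       # physical order, each block once
```

With this sample data the plain index scan needs 5 seeks, while the bitmap-style scan needs 2, because blocks 2-3 and 7-8 are each read in one sequential run.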
Edit: I meant to mention the planner cost constants seq_page_cost and random_page_cost, which inform the planner of the relative cost of fetching a disk page as part of a series of sequential fetches vs. fetching a disk page non-sequentially.
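To show how these two constants and the correlation could interact, here is a deliberately simplified I/O cost model. The values 1.0 and 4.0 are the PostgreSQL defaults for seq_page_cost and random_page_cost. Everything else is an assumption for illustration: the function names, the best-case and worst-case formulas, and the interpolation by squared correlation, which I believe resembles the shape of the planner's estimate but is far cruder than the real cost functions (no caching, no CPU costs).

```python
from math import ceil

SEQ_PAGE_COST = 1.0      # PostgreSQL default for seq_page_cost
RANDOM_PAGE_COST = 4.0   # PostgreSQL default for random_page_cost


def seq_scan_io_cost(table_pages):
    """Read every page of the table sequentially."""
    return table_pages * SEQ_PAGE_COST


def index_scan_io_cost(rows_fetched, rows_per_page, correlation):
    """Toy estimate of heap I/O for an index scan (assumed model, not PostgreSQL's)."""
    # Worst case (correlation 0): every fetched row costs one random page read.
    worst = rows_fetched * RANDOM_PAGE_COST
    # Best case (correlation +/-1): one random read, then the rest sequentially.
    pages = ceil(rows_fetched / rows_per_page)
    best = RANDOM_PAGE_COST + (pages - 1) * SEQ_PAGE_COST
    # Slide from worst to best as the squared correlation goes from 0 to 1.
    return worst + correlation ** 2 * (best - worst)


if __name__ == "__main__":
    # Hypothetical table: 10,000 pages of 100 rows; the query fetches 10% of the rows.
    print(seq_scan_io_cost(10_000))
    print(index_scan_io_cost(100_000, 100, correlation=0.0))
    print(index_scan_io_cost(100_000, 100, correlation=1.0))
```

In this model the uncorrelated index scan (400000.0) is far more expensive than the sequential scan (10000.0), while the perfectly correlated index scan (1003.0) is much cheaper than either, which is the kind of trade-off the correlation statistic lets the planner weigh.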