I'm getting strange and inconsistent results from explain plans on PostgreSQL. The PostgreSQL server is installed on a VMware machine, and when executing several explain plans for a given SQL query, different results are returned each time. It seems to me that the hardware virtualization may provide "erroneous" information to the PostgreSQL server, so that it returns "abnormal and somewhat random" cost measurements. Am I right, or is there another explanation for these surprising results?
In any case, I'd appreciate any pointers to helpful docs on the subject.
VACUUM should be a regular part of database operations. It's probably not the source of your problem, though.
We recommend that active production databases be vacuumed frequently (at least nightly), in order to remove dead rows. After adding or deleting a large number of rows, it might be a good idea to issue a VACUUM ANALYZE command for the affected table. This will update the system catalogs with the results of all recent changes, and allow the PostgreSQL query planner to make better choices in planning queries.
The FULL option is not recommended for routine use, but might be useful in special cases. An example is when you have deleted or updated most of the rows in a table and would like the table to physically shrink to occupy less disk space and allow faster table scans. VACUUM FULL will usually shrink the table more than a plain VACUUM would.
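As a minimal sketch, a maintenance pass on a heavily modified table might look like the following (the table name orders is just a placeholder here):

    -- Routine maintenance: reclaim dead rows and refresh planner statistics.
    VACUUM ANALYZE orders;

    -- Only after deleting or updating most of a table, to return space to the
    -- operating system (takes an exclusive lock and rewrites the table):
    VACUUM FULL orders;

The plain VACUUM ANALYZE form is the one to schedule regularly; VACUUM FULL is the special case described above.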
Since you got different costs on consecutive executions, one under VMware and one without, with no database changes in between, I'd say virtualization is having some effect. I'm fairly sure a virtual machine would appear to have less RAM than bare hardware, but I don't have a way to test that, or its effect on the query optimizer, right now.
Plans are not directly affected by the underlying hardware. They are a product of cost parameters (e.g., random_page_cost), memory settings (e.g., work_mem), and table statistics. You ought to be able to verify easily whether the parameter settings are the same between your two instances. (Ideally one would adjust these parameters based on hardware characteristics, and then you would get different plans on different systems.) The table statistics depend on the actual data in the tables (you did not mention whether you had the same data in both instances) and on some random element when the statistics are collected. Make sure ANALYZE has been run. If you're still clueless, post the plans.
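As a starting point for that comparison, something like the following, run on both instances, shows the relevant settings, refreshes the statistics, and produces a plan you can post (mytable, some_column, and the query itself are placeholders for your own objects):

    -- Compare the planner's cost parameters and memory settings on both servers.
    SHOW random_page_cost;
    SHOW seq_page_cost;
    SHOW work_mem;
    SHOW effective_cache_size;

    -- Refresh table statistics so the planner works from current data.
    ANALYZE mytable;

    -- Capture the plan; EXPLAIN ANALYZE also executes the query and reports actual times.
    EXPLAIN ANALYZE SELECT * FROM mytable WHERE some_column = 42;

If the SHOW output and the data are identical on both instances but the plans still differ, the remaining variable is the sampled statistics, which a fresh ANALYZE should bring back in line.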