Indexing affects only the WHERE clause?_问答_开发者

开发者 https://www.devze.com 2023-02-02 22:07 出处：网络

If I have something like: CREATE INDEXidx_myTable_field_x ONmyTable USINGbtree (field_x); SELECT COUNT(field_x), field_x FROM myTable G开发者_如何学GoROUP BY field_x ORDER BY field_x;

If I have something like:

CREATE INDEX   idx_myTable_field_x
ON             myTable
USING          btree (field_x);

SELECT COUNT(field_x), field_x FROM myTable G开发者_如何学GoROUP BY field_x ORDER BY field_x;

Imagine myTable with around 500,000 rows and most of field_x values being unique.

Since I don't use any WHERE clause, will the created index have any effect at all in my query?

Edit: I'm asking this question because I don't get any relevant difference between query-times before and after creating the index; They always take about 8 seconds (which, of course is too much time!). Is this behaviour expected?

The index will not help here as you are reading the whole table anyway there is no use in going to an index first (PostgreSQL does not yet have index-only scans)

Because nearly all values in the index are unique, it wouldn't really help in this situation anyway. Index lookups (including index-scans for other DBMS) tend to be really helpful for lookup of a small number of rows.

There is a slight possibility that the index might be used for ordering but I doubt that.

If you look at the output of EXPLAIN ANALYZE VERBOSE you can see if the sorting is done in memory or (due to the size of the result) is done on disk.

If sorting is done on disk, you can speed up the query by increasing the work_mem - either globally or just for your session.

Since field_x is the only column referenced in your query, your index covers the query and should help you avoid lookups into actual rows of myTable.

EDIT: As indicated in the comment discussion below, while this answer is valid for most RDBMS implementations, it does not apply to postgresql.

The index should be used. If you ever want to see how your indexes are being used (or not), the execution plan of the query is a great place to see what the database has decided to do. In your case you should execute something like:

explain SELECT COUNT(field_x), field_x FROM myTable GROUP BY field_x ORDER BY field_x;

More information about what all the output you are seeing means can be found in the postgres docs: http://www.postgresql.org/docs/8.4/static/sql-explain.html

There is also: http://wiki.postgresql.org/wiki/Image:Explaining_EXPLAIN.pdf which is a bit more in depth.