开发者

How can I get a COUNT(col) ... GROUP BY to use an index?

开发者 https://www.devze.com 2022-12-28 19:41 出处:网络
I\'ve got a table (col1, col2, ...) with an index on (col1, col2, ...). The table has got millions of rows in it, and I want to run a query:

I've got a table (col1, col2, ...) with an index on (col1, col2, ...). The table has got millions of rows in it, and I want to run a query:

 SELECT col1, COUNT(col2) WHERE col1 NOT IN (<couple of exclusions>) GROUP BY col1

Unfortunately, this is resulting in a full table scan of the table, which takes upwards of a minute. Is there any way of getting oracle to use the index on the columns to return the results much faster?

EDIT:

more specifical开发者_StackOverflow中文版ly, I'm running the following query:

SELECT owner, COUNT(object_name) FROM all_objects GROUP BY owner

and there is an index on SYS.OBJ$ (SYS.I_OBJ2) which indexes the owner# and name columns; I believe I should be able to use this index in the query, rather than a full table scan of SYS.OBJ$


I have had the chance to play around with this, and my previous comments regarding the NOT IN are a red herring in this case. The key thing is the presence of NULLs, or rather whether the indexed columns have NOT NULL constraints enforced.

This is going to depend on the version of the database you're using, because the optimizer gets smarter with each release. I'm using 11gR1 and the optimizer used the index in all cases except one: when both columns were null and I didn't include the NOT IN clause:

SQL> desc big_table
 Name                                  Null?    Type
 -----------------------------------  ------    -------------------
 ID                                             NUMBER
 COL1                                           NUMBER
 COL2                                           VARCHAR2(30 CHAR)
 COL3                                           DATE
 COL4                                           NUMBER

Without the NOT IN clause...

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      group by col4
  4  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1753714399

----------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 31964 |   280K|       |  7574   (2)| 00:01:31 |
|   1 |  HASH GROUP BY     |           | 31964 |   280K|    45M|  7574   (2)| 00:01:31 |
|   2 |   TABLE ACCESS FULL| BIG_TABLE |  2340K|    20M|       |  4284   (1)| 00:00:52 |
----------------------------------------------------------------------------------------

9 rows selected.


SQL>

When I dobbed the NOT IN clause back in, the optimizer opted to use the index. Weird.

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      where col1 not in (12, 19)
  4      group by col4
  5  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 343952376

----------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        | 31964 |   280K|       |  5057   (3)| 00:01:01 |
|   1 |  HASH GROUP BY        |        | 31964 |   280K|    45M|  5057   (3)| 00:01:01 |
|*  2 |   INDEX FAST FULL SCAN| BIG_I2 |  2340K|    20M|       |  1767   (2)| 00:00:22 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------

   2 - filter("COL1"<>12 AND "COL1"<>19)

14 rows selected.

SQL>

Just to repeat, in all other cases, as long as one of the indexed columns was declared not nill, the index was used to satisfy the query. This may not be true on earlier versions of Oracle, but it probably points the way forward.


you could use a hint http://download.oracle.com/docs/cd/B10501_01/server.920/a96533/hintsref.htm , but remember that using an index might not always result in faster execution.


(Just in case, are you sure it's doing a table scan and not an index scan?)

Try using COUNT(*) instead of COUNT(col2) (assuming this is appropriate for you problem, of course). Also, maybe try an index with just col1.


You are querying against oracle's fixed tables, since you've not stated which db vesion this is, I'll assume a recent one. Have the fixed tables been analyzed and have updated statistics? Have you tried your query using the rule base optimizer by the use of the /*+ rule */ hint. Often I've seen that queries against oracle's own fixed tables perform better when the rule base optimizer is used.

0

精彩评论

暂无评论...
验证码 换一张
取 消