Oracle: speeding up count(*)?_问答_开发者_运维开发者技术经验分享

I've got an application being put together with cake/php. It's pretty nice, but their data pager does this:

SELECT COUNT(*) AS COUNT
FROM foo f
LEFT JOIN bar b 
ON (f.asset_group_id = b.asset_group_id)
WHERE 1                    = 1

Any way possible to speed this up?

update: table definitions (extra columns removed):

开发者_开发问答create table bar (
      last_modified_by varchar2(16), 
      asset_group_id number(10,0) not null enable, 
      folder varchar2(512) not null enable, 
      name varchar2(512) not null enable, 
      kind varchar2(16),
      -- exta fields deleted
       constraint bar_pk primary key (folder, name) enable
 );

create index bar_last_modified_date on bar (last_modified_date desc) ;
create index bar_asset_group_id on bar (asset_group_id desc) ;
create index bar_folder on bar (folder) ;
create index bar_kind on bar (kind) ;
create unique index bar_pk on bar (folder, name) ;

create table foo (
      created_date date not null enable, 
      asset_group_id number(10,0) not null enable, 
      keyword varchar2(4000) not null enable, 
      -- exta fields deleted
      constraint foo_pk primary key (asset_group_id, keyword) enable
)  enable row movement ;

create index foo_created on foo (created_date desc) ;
create unique index foo_pk on foo (asset_group_id, keyword) ;

If you don't have indexes on both asset_group_id columns it's time to build some.

A materialized view might help. Then again, this might slow down inserts into the underlying tables, which could be an issue if there are lots of inserts.

You can use a fast refresh materialized view nested on another fast refresh materialized view .

(I insert sample data)

create table bar (
      last_modified_by varchar2(16), 
      asset_group_id number(10,0) not null enable, 
      folder varchar2(512) not null enable, 
      name varchar2(512) not null enable, 
      -- exta fields deleted
       constraint bar_pk primary key (folder, name) enable
 );

create index bar_asset_group_id on bar (asset_group_id desc) ;
create index bar_folder on bar (folder) ;
create index bar_kind on bar (kind) ;

insert into bar values ('deew',1,'A','B');
insert into bar values ('deew',1,'A','C');
insert into bar values ('deew',1,'B','C');
insert into bar values ('deew',2,'E','C');

commit;

create table foo (
      created_date date not null enable, 
      asset_group_id number(10,0) not null enable, 
      keyword varchar2(4000) not null enable, 
      -- exta fields deleted
      constraint foo_pk primary key (asset_group_id, keyword) enable
)  enable row movement ;


insert into foo values (sysdate,1,'dd');
insert into foo values (sysdate,2,'dd');
insert into foo values (sysdate,3,'dd');
insert into foo values (sysdate,1,'ddE');

commit;

create index foo_created on foo (created_date desc) ;

create materialized view log on bar with rowid including new values;
create materialized view log on foo with rowid including new values;

create materialized view foobar_count_helper 
refresh fast on commit
as
select f.rowid rowid_f
,      b.rowid rowid_b
from   foo f
,      bar b
where  f.asset_group_id = b.asset_group_id (+)
/

create materialized view log on foobar_count_helper with rowid including new values; 

create materialized view foobar_count
refresh fast on commit
as
select count(*) count
from   foobar_count_helper
/

Test results:

SQL> 
SQL> 
SQL> select * from foobar_count;

     COUNT
----------
         8

SQL> 
SQL> SELECT COUNT(*) AS COUNT
  2  FROM foo f
  3  LEFT JOIN bar b
  4  ON (f.asset_group_id = b.asset_group_id)
  5  where 1=1
  6  /

     COUNT
----------
         8

There's not really any way you could modify the SQL to make it faster - it's already a rather basic query. You might be able to modify aspects of the tables themselves (indices, et cetera), but the SQL itself is about the most efficient representation you could get of the information desired.

Make sure that f.asset_group_id and b.asset_group_id are indexed.

I believe that internally count(*) and count(1) do the same fetch (none) in Oracle.

Regards
K

ps, If you've got the enterprise version you can create a bitmap join index on f.asset_group_id,b.asset_group_id for some real lookup speed :-)

You do have indexes involving the asset group ids, but one is descending and the other one also consists of the keyword column. Let's ask the Oracle query planner if it's using these or not in your query:

EXPLAIN SELECT COUNT(*) ...

If you could post these results, that would give us a lot of information.

I would expect these indexes to work if your current ones aren't used:

create index bar_asset_group_id on bar (asset_group_id);
create index foo_asset_group_id on foo (asset_group_id);

Rename the current bar_asset_group_id to bar_asset_group_id_desc for clarity.

And delete WHERE 1 = 1: it's (almost entirely) harmless, but very unnecessary.

Depends on the data, the correlation between foo.asset_group_id and bar.asset_group_id. You need to check the plan to see which table is driving the query, what join method it is using. For example, if one of the tables was an order of magnitude smaller than the other, a hash join might be preferable instead of nested loops + index lookups. Another option (esp. if both tables are quite large) is a sort merge, assisted with suitable indexes on both asset_group_id columns.

If foo and bar are indexed and both tables have up to date statistics, then count(*) will usually return the row count in the most efficient way.

Perhaps it's just a big table, in which case getting a full up-to-date row count before paging through the table content is just going to be expensive.

A couple of ways around this:

Don't worry about how many pages of results there are - "page 1 of 3439" isn't a lot more useful than "page 1 of lots".
Pre-calculate the results and update them as regularly as you need to. For example, if the row count is 24 hours old would that be ok?

Really need more information about the design of the tables before providing any useful answer. The query includes an outer join and if asset_group_id is the primary key of bar then the join isn't even necessary.... but there is no way of knowing without table definitions.

A way of this speeding up, try replacing Count(*) with Count(ColumnNo1, ColumnNo2,...), the asterisk used in it could be the slowdown there as the database has to work out each time what does the asterisk mean...by including specific column names in the sql, you are effectively relieving the database in having to work out what are the columns used in that table.

Hope this helps, Best regards, Tom.