Will inserting half a million entries with the same date value be slowed by a non-unique index on that date column?


I have a cursor that selects all rows in a table, a little over 500,000 rows. Read a row from cursor, INSERT into other table, which has two indexes, neither unique, one numeric, one 'DATE' type. COMMIT. Read next row from Cursor, INSERT...until Cursor is empty.

All my DATE column's values are the same, from a timestamp initialized at the start of the script.

This thing's been running for 24 hours and has only posted 464K rows, a little less than 20K rows/hr.

Oracle 11g, 10 processors (!?). Something has to be wrong. I think it's that DATE index trying to process all these entries with exactly the same value for that column.


Why don't you just do:

insert into target (columns....) 
select columns and computed values 
from source

commit

?

This row-at-a-time ("slow by slow") processing is doing far more damage to performance than the indexes are, and an index on a column where every value is the same may not make much sense anyway.


Indexes slow down inserts but speed up queries. This is normal.

If it is a problem, you can drop the index, insert the rows, then recreate the index. This can be faster when you are doing many inserts at once, as in the sketch below.
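A minimal sketch of that drop-and-rebuild pattern; the table, index, and column names here are hypothetical placeholders, not names from the question:

-- dest_table and its two non-unique indexes (placeholder names)
drop index dest_num_idx;
drop index dest_date_idx;

insert into dest_table (num_col, date_col)
select num_col, date_col from source_table;
commit;

-- rebuild the indexes once, after all rows are in place
create index dest_num_idx  on dest_table (num_col);
create index dest_date_idx on dest_table (date_col);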

The way you are copying the data using cursors seems to be inefficient. You could try a set-based approach instead:

INSERT INTO table1 (x, y, z)
SELECT x, y, z FROM table2 WHERE ...


Committing after every inserted row doesn't make much sense. If you're worried about exceeding undo capacity, for example, you can keep a count of the inserts and issue a commit after every thousand rows.
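For instance, a minimal sketch of batched commits inside a cursor loop, assuming hypothetical table names and that dest_table has the same column layout as source_table:

declare
    l_count pls_integer := 0;
begin
    for r in (select * from source_table) loop
        insert into dest_table values r;  -- record insert; layouts assumed to match
        l_count := l_count + 1;
        -- commit once per 1,000 rows instead of once per row
        if mod(l_count, 1000) = 0 then
            commit;
        end if;
    end loop;
    commit;  -- pick up the final partial batch
end;
/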

Updating the indexes will have some impact, but that's unavoidable if you can't drop (or disable) them while the inserts are performed. I'd expect the per-row commits to have a bigger impact, though I suspect that's a topic with varied opinions.

This assumes you have a good reason for inserting from a cursor rather than with a direct INSERT INTO ... SELECT FROM statement.


In general, it's often a good idea to drop the indexes before doing a massive insert and then add them back afterwards, so that the DB doesn't have to update the indexes with each insert. It's been a long while since I've used Oracle, but have you tried putting more than one INSERT statement in a transaction? That should also speed it up.


For operations like this you should look at Oracle bulk operations, using FORALL and BULK COLLECT. They considerably reduce the number of context switches between the PL/SQL and SQL engines.

create or replace procedure fast_proc is
    -- "table" is a reserved word, so the type and variable need other names
    type t_rows is table of source_table%ROWTYPE;
    l_rows t_rows;
begin
    select * bulk collect into l_rows
    from source_table;

    forall i in l_rows.first .. l_rows.last
        insert into dest_table values l_rows(i);
end;
/
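Note that a plain BULK COLLECT like this loads all 500K rows into session memory at once; the LIMIT-based variant sketched in the last answer below keeps memory bounded.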


Agreed with the comment that what is killing your time is the "slow by slow" processing. Copying 500,000 rows should be a matter of minutes.

The single INSERT ... SELECT FROM ... approach would be the best one, provided you have big enough rollback segments. The database may even automatically apply parallel techniques to a plain SQL statement that it cannot apply to PL/SQL.

In addition, you could look at using the /*+ APPEND */ hint (a direct-path insert): read up on it and see whether it applies to the situation with your target table.
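As a sketch, using the same hypothetical table names as above:

insert /*+ APPEND */ into dest_table
select * from source_table;
-- a direct-path insert must be committed before the session can read the table again
commit;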

To use all 10 cores you will need either to use plain parallel SQL, or to run 10 copies of your PL/SQL block, splitting the source table across the 10 copies.

In Oracle 10 this is a manual task (roll your own parallelism) but Oracle 11.2 introduces DBMS_PARALLEL_EXECUTE.
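A minimal DBMS_PARALLEL_EXECUTE sketch for 11.2; the task name, chunk size, and the source/dest table names are illustrative assumptions:

begin
    dbms_parallel_execute.create_task('copy_rows');
    -- split the source table into rowid ranges of ~10,000 rows each
    dbms_parallel_execute.create_chunks_by_rowid(
        task_name   => 'copy_rows',
        table_owner => user,
        table_name  => 'SOURCE_TABLE',
        by_row      => true,
        chunk_size  => 10000);
    -- each worker receives one rowid range via the :start_id / :end_id binds
    dbms_parallel_execute.run_task(
        task_name      => 'copy_rows',
        sql_stmt       => 'insert into dest_table
                           select * from source_table
                           where rowid between :start_id and :end_id',
        language_flag  => dbms_sql.native,
        parallel_level => 10);
    dbms_parallel_execute.drop_task('copy_rows');
end;
/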

Failing that, bulking up your fetch/insert using BULK COLLECT and a bulk insert would be the next best option: process in chunks of 1,000 or so rows (or larger). Again, take a look at whether DBMS_PARALLEL_EXECUTE may help you, or whether you could submit the job in chunks via DBMS_JOB.
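A chunked variant of the earlier BULK COLLECT procedure, using a LIMIT so memory stays bounded (table names again hypothetical):

declare
    cursor c is select * from source_table;
    type t_rows is table of source_table%ROWTYPE;
    l_rows t_rows;
begin
    open c;
    loop
        fetch c bulk collect into l_rows limit 1000;
        exit when l_rows.count = 0;

        forall i in 1 .. l_rows.count
            insert into dest_table values l_rows(i);

        commit;  -- one commit per 1,000-row chunk, not per row
    end loop;
    close c;
end;
/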

(Caveat : I don't have access to anything later than Oracle 10)
