开发者

UPDATE vs INSERT performance

开发者 https://www.devze.com 2023-04-02 02:59 出处:网络
Am I correct to assume that an UPDATE query takes more r开发者_StackOverflowesources than an INSERT query?I am not a database guru but here my two cents:

Am I correct to assume that an UPDATE query takes more r开发者_StackOverflowesources than an INSERT query?


I am not a database guru but here my two cents:

Personally I don't think you have much to do in this regard, even if INSERT would be faster (all to be proven), can you convert an update in an insert?! Frankly I don't think you can do it all the times.

During an INSERT you don't usually have to use WHERE to identify which row to update but depending on your indices on that table the operation can have some cost.

During an update if you do not change any column included in any indices you could have quick execution, if the where clause is easy and fast enough.

Nothing is written in stones and really I would imagine it depends on whole database setup, indices and so on.

Anyway, found this one as a reference:

Top 84 MySQL Performance Tips


If you plan to perform a large processing (such as rating or billing for a cellular company), this question has a huge impact on system performance.

Performing large scale updates vs making many new tables and index has proven to reduce my company billing process form 26 hours to 1 hour!

I have tried it on 2 million records for 100,000 customer.
I first created the billing table and then every customer summary calls, I updated the billing table with the duration, price, discount.. a total of 10 fields.

In the second option I created 4 phases.
Each phase reads the previous table(s), creates index (after the table insert completed) and using: "insert into from select .." I have created the next table for the next phase.

Summary
Although the second alternative requires much more disk space (all views and temporary tables deleted at the end) there are 3 main advantages to this option:

  1. It was 4 time faster than option 1.
  2. In case there was a problem in the middle of the process I could start the process from the point it failed, as all the tables for the beginning of the phase were ready and the process could restart from this point. If the process fails implementing the first option, you will need to start the all the process all over again.
  3. This made the development and QA work much faster as they could work parallel.


The key resource here is disk access (IOPS to be precise) and we should evaluate which ones results in minimum of that.

Agree with others on how it is impossible to give a generic answer but some thoughts to lead you in the right direction , assume a simple key-value store and key is indexed. Insertion is inserting a new key and update is updating the value of an existing key.

If that is the case (a very common case) , update would be faster than insertion because update involves an indexed lookup and changing an existing value without touching the index. You can assume that is one disk read to get the data and possibly one disk write. On the other hand insertion would involve two disk writes one for index , one for data. But the another hidden cost is the btree node splitting and new node creation which would happen in background while insertion leading to more disk access on average.


You cannot compare an INSERT and an UPDATE in general. Give us an example (with schema definition) and we will explain which one costs more and why. Also, you can compere a concrete INSERT and an UPDATE by checking their plan and execution time.

Some rules of thumbs though:

  • if you only update only one field, which is not indexed and you only update one record and you use rowid/primary key to find that record then this UPDATE will cost less, than
  • an INSERT, which will also affect only one row, though this row will have many not null constrained, indexed fields; and all those indexes have to be maintained (e.g. add a new leaf)


It depends. A simple UPDATE that uses a primary key in the WHERE clause and updates only a single non-indexed field would likely be less costly than an INSERT on the same table. But even that depends on the database engine involved. An UPDATE that involved modifying many indexed fields, however, might be more costly than the INSERT on that table because more index key modifications would be required. An UPDATE with a poorly constructed WHERE clause that required a table scan of millions of records would certainly be more expensive than an INSERT on that table.

These statements can take many forms, but if you limit the discussion to their "basic" forms that involve a single record, then the larger portion of the cost will usually be dedicated to modifying the indexes. Each indexed field that is modified during an UPDATE would typically involve two basic operations (delete the old key and add the new key) whereas the INSERT would require one (add the new key). Of course, a clustered index would then add some other dynamics as would locking issues, transaction isolation, etc. So, ultimately, the comparison between these statements in a general sense is not really possible and would probably require benchmarking of specific statements if it actually mattered.

Typically, though, it makes sense to just use the correct statement and not worry about it since it is usually not an option to choose between an UPDATE and an INSERT.


It depends. If update don't require changes of the key it's most likely that it will only costs like a search and then it will probably cost less than an insert, unless database is organized like an heap.

This is the only think i can state, because performances greatly depends on the database organization used.

If you for example use MyISAM that i suppose organized like an isam, insert should cost generally the same in terms of database read accesses but it will require some additional write operation.


On Sybase / SQL Server an update which impacts a column with a read-only index is internally replaced by a delete and then an insert, so this is obviously slower than insert. I do not know the implementation for other engines but I think this is a common strategy at least when indices are involved. Now for tables without indices ( or for update requests not involving any index ) I suppose there are cases where the update can be faster, depending on the structure of the table.


In mysql you can change your update to insert with ON DUPLICATE KEY UPDATE

INSERT INTO t1 (a,b,c) VALUES (1,2,3)
  ON DUPLICATE KEY UPDATE c=c+1;

UPDATE t1 SET c=c+1 WHERE a=1;


A lot of people here are commenting that you can't compare an insert vs update but I disagree. People should understand that an update takes a lot more resources than insert or even possibly deleting and inserting.

Now regarding how you can even compare the 2 as one doesn't directly replace the other. But in certain cases you make an insert and then update the table with data from another table.

For instance I get a feed from an API which contains id1, but this table relates to another table and I would like to add table2_id. Instead of doing an update statement that takes a lot more resources, I can handle this in the backend which is faster and just do an insert statement instead of an insert and then an update. The update statement also locks the table causing a traffic jam so to speak.

0

精彩评论

暂无评论...
验证码 换一张
取 消