Is it better to run 100 SQL queries with a WHERE clause that has one condition, or one query with a WHERE clause that has 100 conditions? For example, if I was looking to see whether 100 usernames are already present in a table, would it be better to iterate through 100 queries or do one complex query that looks for 10开发者_如何学编程0 usernames at a time.
I haven't done any tests but I would prefer one large query. With 100 queries, you have to connect to the database, send the query string, and handle the response/results 100 times. With one single query, you send one (larger) query, and get one response back. I don't know the exact cost of 100 context switches, but it's probably not insignificant. The database will probably have to do the same amount of work anyway with one large query. And if all you're checking is 1 username vs. 100 usernames, it's not a very complex query it's more like
select *
from users
where username in ('value1',...,'value100');
The fewer queries, the better.
Less is more. Do the fewest amount of queries to achieve your results in this case.
SQL is set based. It is better to run one query with a clause that has 100 conditions. Remember that each query represents a seperate, implicit transaction - thus it requrires more overhead to run a hundred simple queries as opposed to 1 complex query (from a processing and bandwidth perspective). That being said, from an implementation perspective, running the 100 queries may be easier regardless of performance.
IMHO it's personal preference, unless you are doing some mega huge crazy stuff that spans across a few tables etc and on a massive amount of data, the speed gains aren't going to be huge either way.
if you do decide to do it all by queries though, make sure that you set your keys and indexes well, otherwise for each query it has to read in the whole table to find what it needs.
you'll get some HUGE gains in speed if you set your indexes well.
Will it be easier if you create @table, and add 100 names to the @table.
Filter the query by IN statement.
I am willing to bet that one large query is faster...
... but rather than believe my gut feel, there is a simple way to prove it. Built a simple performance prototype and see.
The short answer is that it depends. It depends upon the data in the tables, the types of where clauses that you're running against (primary key vs multiple key, etc), the size of result-sets, etc. In some cases one query will be much faster, in others many will be.
In general, the fewer queries that are executed the better. However with that said, 100 highly efficient queries (primary key lookups for example) are far better than 1 very inefficient query.
As far as your specific problem, if the 100 queries are identical, then combine them into one. If they are related (for example, fetching authors for 100 posts), then combine them into one. If they have nothing to do with each other, don't bother combining them (since it will hurt readability and maintainability).
this is pretty fast but not very pretty :P
drop table if exists users;
create table users
(
user_id int unsigned not null auto_increment primary key,
username varbinary(32) unique not null
)
engine=innodb;
drop procedure if exists username_check;
delimiter #
create procedure username_check
(
in p_username_csv varchar(65535)
)
proc_main:begin
declare v_token varchar(255);
declare v_done tinyint unsigned default 0;
declare v_idx int unsigned default 1;
if p_username_csv is null or length(p_username_csv) <= 0 then
leave proc_main;
end if;
-- split the string into tokens and put into an in-memory table...
create temporary table tmp(
username varbinary(32),
username_exists tinyint unsigned default 0,
key (username)
)engine = memory;
while not v_done do
set v_token = trim(substring(p_username_csv, v_idx,
if(locate(',', p_username_csv, v_idx) > 0,
locate(',', p_username_csv, v_idx) - v_idx, length(p_username_csv))));
if length(v_token) > 0 then
set v_idx = v_idx + length(v_token) + 1;
insert into tmp (username) values(v_token);
else
set v_done = 1;
end if;
end while;
-- flag any that exist
update tmp t inner join users u on t.username = u.username
set t.username_exists = 1;
select * from tmp order by username;
drop temporary table if exists tmp;
end proc_main #
delimiter ;
select count(*) from users;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.22 sec)
call username_check('user1,user2,user3... user250');
250 rows in set (0.01 sec)
精彩评论