I'm having MySQL timeout issues with a query I'm running. It's a simple query, but even in a MySQL editor it takes around five minutes to complete. I'm hoping someone might know a better way to fix this timeout issue.
string processedCONString = "SERVER=localhost;" +
    "DATABASE=discovery;" +
    "UID=;" +
    "PASSWORD=;" +
    "connection timeout=500000";
MySqlConnection processCON = new MySqlConnection(processedCONString);
string mySQLCOMMAND = "UPDATE " + siteString + "_discovery " +
    "SET processed = b'0' " +
    "WHERE URL NOT IN (SELECT URL FROM live)";
MySqlCommand mysqlprocessCmdInsertItem = new MySqlCommand(mySQLCOMMAND, processCON);
processCON.Open();
mysqlprocessCmdInsertItem.ExecuteNonQuery();
processCON.Close();
Yes, the UID and password are left blank here, but not in the actual code.
Also, as this database grows, the query is going to take longer and longer.
Try indexing the URL column in the table referenced by siteString + "_discovery".
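As a minimal sketch, assuming the generated table name comes out as something like site1_discovery (hypothetical; substitute whatever siteString produces) and that URL is a VARCHAR column; indexing live.URL as well helps the subquery side:

```sql
-- site1_discovery is a stand-in for whatever siteString + "_discovery" produces.
CREATE INDEX idx_discovery_url ON site1_discovery (URL);
CREATE INDEX idx_live_url ON live (URL);
```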
Update:
Also be aware that the order in which you specify conditions in your WHERE clauses and INNER JOINs can matter. You want to be conscious of whether a statement will cause an operation to be performed for each row, or whether it can be evaluated once ahead of time and then applied to the rows. There are several rules for this that have been well documented online, and the other answers offer some good advice here. In addition, when I worked at a large agency, I always ran my SQL scripts past our DBA, who would then strongly chastise me and complain about how software developers are always breaking his m*** f*** database. If you have someone like that, they are usually a great help, as they have all of these rules memorized and we do not.
Google "sql query best practices" and you will find a plethora of information. Here is one link:
http://blog.sqlauthority.com/2009/01/20/sql-server-rules-for-optimizining-any-query-best-practices-for-query-optimization/
Jonathan Henson's answer is a good option; +1 to that.
If that is still not enough, you could try processing the update in batches. Imagine you have an ID column: you could place your code inside a loop and process 1000 (or whatever number you find appropriate) rows in each iteration.
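As a sketch of one iteration, assuming the discovery table has an auto-increment id column (hypothetical; the schema isn't shown in the question) and using site1_discovery as a stand-in for the generated table name:

```sql
-- Run repeatedly from application code, advancing @start by 1000 each pass,
-- until the table has been covered (or the affected-rows count is 0).
UPDATE site1_discovery
SET processed = b'0'
WHERE id >= @start AND id < @start + 1000
  AND URL NOT IN (SELECT URL FROM live);
```

Smaller batches hold locks for less time and keep any single statement well under the timeout.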
Maybe I'm reaching, but try this SQL...
"UPDATE " + siteString + "_discovery as d " +
" SET d.processed = b'0' " +
" WHERE NOT EXISTS (SELECT 1 FROM live WHERE live.URL = d.URL) ";
Rather than evaluating the entire subquery up front, correlate it so that you're not returning every row of live while evaluating the condition in the parent statement. Note that the original query updates rows whose URL is absent from live, so the rewrite has to keep that NOT semantics.
Although, the correlated subquery is still going to run for every single row, so we should probably remove the subquery altogether and let a join take care of this. Since we only want to do this for URLs that do not appear in the LIVE table, a LEFT JOIN with an IS NULL check should be cake to flatten out the subquery...
"UPDATE d " +
" SET d.processed = b'0' " +
" FROM " + siteString + "_discovery as d " +
" LEFT JOIN live as l " +
" ON d.URL = l.URL " +
" WHERE l.URL IS NULL";
This particular update was tested on SQL Server, not MySQL, so you may have to tweak the syntax to convert it. I'm not 100% sure; can anyone confirm?
If it's the timeout exception that's bothering you, then
mysqlprocessCmdInsertItem.CommandTimeout = 1000;
might help (or some other large number of seconds). The default command timeout is 30 seconds, and only time spent in network reads counts toward it.
I know this thread is ancient, and Mohgeroth was on the right track, but for MySQL the query would look like this...
update siteX_discovery
left join live on live.URL = siteX_discovery.URL
set processed = b'0'
where live.URL is null
Of course, as Jonathan pointed out, you need the index or it will be slow no matter what you do.
All by itself, the index will be good enough for quite a while, but if your table gets large, eliminating that subquery will make a noticeable difference.
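Once the index is in place, you can sanity-check that MySQL actually uses it by running EXPLAIN on the matching SELECT (siteX_discovery standing in for the generated table name):

```sql
EXPLAIN SELECT siteX_discovery.URL
FROM siteX_discovery
LEFT JOIN live ON live.URL = siteX_discovery.URL
WHERE live.URL IS NULL;
-- The live row of the plan should show an index-based access type (e.g. "ref")
-- on the URL index, not ALL (a full table scan).
```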