I have an issue with a LINQ to SQL query, performance-wise. What I am trying to do is find out whether the elements of my collection of about 500 items (in a List) match database entries. Currently, this operation alone takes about 300 seconds to complete! The database contains over a million rows and is bound to grow in the future, so this level of performance so early on is simply unacceptable. Sample below:
var query = from item in db.DataTable.Where(x => x.date == suppliedDate)
            where inputList.Contains(item.name)
            select new { item.name };
Help!
Edit: Thanks a lot for all your suggestions! I just wanted to add a few additional observations, as I've now been able to view the SQL output of my LINQ query (see below):
SELECT [t0].[name]
FROM [dw].[DataTable] AS [t0]
WHERE ([t0].[name] IN (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9, @p10, @p11, @p12, @p13, @p14, @p15, @p16, @p17)) AND ([t0].[date] = @p18)
-- @p0: Input NVarChar (Size = 5; Prec = 0; Scale = 0) [Mark]
-- @p1: Input NVarChar (Size = 5; Prec = 0; Scale = 0) [Owen]
-- @p2: Input NVarChar (Size = 5; Prec = 0; Scale = 0) [James]
-- @p3: Input NVarChar (Size = 5; Prec = 0; Scale = 0) [John]
etc..
Is this making 500 separate hits on the DB (as Ian suggested)? And is there any way I can improve performance without having to resort to stored procedures or creating an additional table? (Neither option is really available to me right now.) I've tried Geoff's suggestion and it brought my run time down from 300 seconds to about 126 seconds, but that's still a lot, especially considering that a DB join would take less than 10 seconds at most.
Many Thanks
-- @p0: Input NVarChar (Size = 5; Prec = 0; Scale = 0) [Mark]
Is the column varchar?
If you check the execution plan for that query, you might see SQL Server converting the whole index to nvarchar (DOOOOOM!).
The fix is to convert the parameters to varchar.
You can get the command and reset the types of its parameters directly (to AnsiString in your case).
http://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.getcommand.aspx
http://msdn.microsoft.com/en-us/library/system.data.dbtype.aspx
Then you can call ExecuteReader on that command, yielding a DbDataReader. You can hand this DbDataReader to the Translate method of your DataContext, and it will give you the IEnumerable<T> that you'd expect from LINQ.
http://msdn.microsoft.com/en-us/library/bb534213.aspx
I've posted the code to accomplish this here.
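A minimal sketch of that approach (not the exact code from that link); the NameRow DTO is an assumption, standing in for a class whose property matches the projected "name" column:

using System.Data;
using System.Data.Common;
using System.Linq;

// Hypothetical DTO whose property matches the projected "name" column.
public class NameRow { public string name { get; set; } }

// The same query as in the question.
var query = from item in db.DataTable.Where(x => x.date == suppliedDate)
            where inputList.Contains(item.name)
            select new { item.name };

// Grab the command LINQ to SQL would run and retype its string parameters
// from NVarChar (DbType.String) to varchar (DbType.AnsiString).
DbCommand cmd = db.GetCommand(query);
foreach (DbParameter p in cmd.Parameters)
{
    if (p.DbType == DbType.String)
        p.DbType = DbType.AnsiString;
}

// Execute the command ourselves and let the DataContext materialize the rows.
db.Connection.Open();
try
{
    using (DbDataReader reader = cmd.ExecuteReader())
    {
        var names = db.Translate<NameRow>(reader).ToList();
    }
}
finally
{
    db.Connection.Close();
}

With the parameters typed as varchar, SQL Server can seek the existing index on name instead of converting the whole column to nvarchar.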
You may be making 500 separate requests to the database with that query!
First, look at the SQL that is actually running, using the profiler in SQL Server, and see if LINQ to SQL is doing what you expect.
Try
var query = from item in db.DataTable
            where item.date == suppliedDate
            where inputList.Contains(item.name)
            select new { item.name };
You can use LINQPad to test your query; it will also show you the SQL that gets generated.
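Another way to see the generated SQL, if Profiler or LINQPad aren't to hand, is to point the DataContext's Log property at a TextWriter; a quick sketch:

using System;
using System.Linq;

// Dump the generated SQL (and its parameter values) to the console.
db.Log = Console.Out;

var query = from item in db.DataTable
            where item.date == suppliedDate
            where inputList.Contains(item.name)
            select new { item.name };

// The SQL is written out when the query actually executes.
var results = query.ToList();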
This sort of thing will be quite a bit faster done in the database given the number of records you're dealing with.
I'd bulk insert the items in your list into a temp table and then do an inner join (or INTERSECT/EXCEPT as needed) to get the result.
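A rough sketch of that approach using SqlBulkCopy; the connection string connStr, the varchar(50) column length, and the DateTime parameter type are assumptions, while dw.DataTable and the name/date columns come from the generated SQL above:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

var matches = new List<string>();
using (var conn = new SqlConnection(connStr))
{
    conn.Open();

    // 1. Session-scoped temp table to hold the ~500 names.
    using (var create = new SqlCommand(
        "CREATE TABLE #names (name varchar(50) PRIMARY KEY)", conn))
        create.ExecuteNonQuery();

    // 2. Bulk insert the list in a single round trip.
    var staging = new DataTable();
    staging.Columns.Add("name", typeof(string));
    foreach (var n in inputList) staging.Rows.Add(n);

    using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "#names" })
        bulk.WriteToServer(staging);

    // 3. Let SQL Server do the join.
    const string sql = @"SELECT t.name
                         FROM dw.DataTable AS t
                         JOIN #names AS n ON n.name = t.name
                         WHERE t.date = @date";
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.Add("@date", SqlDbType.DateTime).Value = suppliedDate;
        using (var reader = cmd.ExecuteReader())
            while (reader.Read())
                matches.Add(reader.GetString(0));
    }
}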
SQL Server 2008 also has a MERGE statement that is super fast and may help too. Doing something like this with over a billion records takes a few milliseconds, provided your hardware can cope.
Which brings me to the other thing :) hardware. Don't underestimate the hardware requirements, especially hard drive speed (typically a RAID 5/6 array of anywhere between 6 and 12 spindles) for the kind of data you're dealing with, if you need really good performance from your queries.
A faster way would be to avoid the LINQ code: save the inputList names into a SQL table (using an XML insert), then write a stored procedure that does the select and returns a result set. Then in LINQ you can call that stored procedure and extract the result.
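A hedged sketch of what the calling side might look like; the stored procedure dbo.MatchNames, its parameters, and the NameRow DTO are all hypothetical (the procedure itself would shred the XML with nodes()/value() and join it against dw.DataTable):

using System.Linq;
using System.Xml.Linq;

// Hypothetical DTO with a string property matching the procedure's result column.
public class NameRow { public string name { get; set; } }

// Pack the ~500 names into one XML parameter instead of 500 IN parameters.
string namesXml = new XElement("names",
    inputList.Select(n => new XElement("n", n))).ToString();

// Call the (hypothetical) stored procedure through the DataContext and
// pull the matching names out of the result set.
var matches = db.ExecuteQuery<NameRow>(
        "EXEC dbo.MatchNames @names = {0}, @date = {1}", namesXml, suppliedDate)
    .Select(r => r.name)
    .ToList();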