I got a very simple LINQ query:
List<table> list = ( from t in ctx.table
where
t.test == someString
&& t.date >= dateStartInt
&& t.date <= dateEndInt
select t ).ToList<table>();
The table which gets queried has got about 30 million rows, but the columns test
and date
are indexed.
When it should return around 5000 rows it takes several minutes to complete.
I also checked the SQL command which LINQ generates. If I run tha开发者_C百科t command on the SQL Server it takes 2 seconds to complete.
What's the problem with LINQ here? It's just a very simple query without any joins.
That's the query SQL Profiler shows:
exec sp_executesql N'SELECT [t0].[test]
FROM [dbo].[table] AS [t0]
WHERE ([t0].[test] IN (@p0)) AND ([t0].[date] >= @p1)
AND ([t0].[date] <= @p2)',
N'@p0 nvarchar(12),@p1 int,@p2 int',@p0=N'123test',@p1=110801,@p2=110804
EDIT:
It's really weird. While testing I noticed that it's much faster now. The LINQ query now takes 3 seconds for around 20000 rows, which is quite ok.
What's even more confusing: It's the same behaviour on our production server. An hour ago it was really slow, now it's fast again. As I was testing on the development server, I didn't change anything on the production server. The only thing I can think of being a problem is that both servers are virtualized and share the SAN with lots of other servers.
How can I find out if that's the problem?
Before blaming LINQ first find out where the actual delay is taking place.
- Sending the query
- Execution of the query
- Receiving the results
- Transforming the results into local types
- Binding/showing the result in the UI
- And any other events tied to this process
Then start blaming LINQ ;)
If I had to guess, I would say "parameter sniffing" is likely, i.e. it has built and cached a query plan based on one set of parameters, which is very suboptimal for your current parameter values. You can tackle this with OPTION (OPTIMIZE FOR UNKNOWN)
in regular TSQL, but there is no LINQ-to-SQL / EF way of expolsing this.
My plan would be:
- use profiling to prove that the time is being lost in the query (as opposed to materialization etc)
- once confirmed, consider using direct TSQL methods to invoke
For example, with LINQ-to-SQL, ctx.ExecuteQuery<YourType>(tsql, arg0, ...)
can be used to throw raw TSQL at the server (with parameters as {0}
etc, like string.Format
). Personally, I'd lean towards "dapper" instead - very similar usage, but a faster materializer (but it doesn't support EntityRef<>
etc for lazy-loading values - which is usually a bad thing anyway as it leads to N+1).
i.e. (with dapper)
List<table> list = ctx.Query<table>(@"
select * from table
where test == @someString
and date >= @dateStartInt
and date <= @dateEndInt
OPTION (OPTIMIZE FOR UNKNOWN)",
new {someString, dateStartInt, dateEndInt}).ToList();
or (LINQ-to-SQL):
List<table> list = ctx.ExecuteQuery<table>(@"
select * from table
where test == {0}
and date >= {1}
and date <= {2}
OPTION (OPTIMIZE FOR UNKNOWN)",
someString, dateStartInt, dateEndInt).ToList();
精彩评论