Is LINQ lazy-evaluated?

Source: https://www.devze.com, 2023-02-08 07:35 (reposted from the web)
Greetings, I have the following question. I did not find an exact answer to it, and it's really interesting to me. Suppose I have the following code that retrieves records from a database (in order to export them to an XML file, for example).

var result = from emps in dc.Employees
  where emps.age > 21
  select emps;

foreach (var emp in result) {
  // Append this record in suitable format to the end of XML file
}

Suppose a million records satisfy the where condition in the code. What will happen? Will all this data be retrieved from SQL Server into memory at once when execution reaches the foreach construct, or will it be retrieved as needed: the first record, then the second, and so on? In other words, does LINQ really handle iterating through large collections (see my post here for details)?

If not, how can I overcome the memory issues in that case? If I really need to traverse a large collection, what should I do? Calculate the number of elements with the Count function and then read the data from the database in small portions? Is there an easy way to implement paging with LINQ?


All the data will be retrieved from SQL Server at one time and put into memory. The only way around this that I can think of is to process the data in smaller chunks (for example, paging with Skip() and Take()). But, of course, this requires more round trips to SQL Server.

Here is a LINQ paging extension method I wrote to do this:

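// Takes IQueryable so that Skip and Take are translated to SQL by the provider;
// with IEnumerable they would run in memory, after all rows had been fetched.
public static IQueryable<TSource> Page<TSource>(this IQueryable<TSource> source, int pageNumber, int pageSize)
    {
        return source.Skip((pageNumber - 1) * pageSize).Take(pageSize);
    }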


Yes, LINQ uses lazy evaluation. The database would be queried when the foreach starts to execute, but it would fetch all the data in one go (it would be much less efficient to do millions of queries for just one result at a time).

If you're worried about bringing in too many results in one go, you could use Skip and Take to get only a limited number of results at a time (thus paginating your result).
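
To make the "queried when the foreach starts" point concrete, here is a minimal sketch using LINQ to Objects rather than LINQ to SQL, so it shows only deferred execution of an in-memory query and says nothing about how a database provider fetches rows. The class name DeferredExecutionDemo and its Run method are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns a log of events showing when the filter actually runs.
    public static List<string> Run()
    {
        var log = new List<string>();
        int[] ages = { 18, 25, 30 };

        // Building the query runs nothing: the lambda below is not called yet.
        IEnumerable<int> adults = ages.Where(age =>
        {
            log.Add("filter " + age);
            return age > 21;
        });

        log.Add("query built");

        // Enumeration pulls items one at a time, so filter and body interleave.
        foreach (int age in adults)
        {
            log.Add("body " + age);
        }

        return log;
    }
}
```

Calling DeferredExecutionDemo.Run() yields the entries "query built", "filter 18", "filter 25", "body 25", "filter 30", "body 30", in that order: the filter does not run until the foreach begins, and then it runs once per element as the loop advances.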


It'll be retrieved when you invoke ToList or similar methods. LINQ has deferred execution:

  • http://weblogs.asp.net/psteele/archive/2008/04/18/linq-deferred-execution.aspx

Even with deferred execution, how the data is actually loaded (for example, whether the entire collection is pulled from the data source at once) is determined by the implementer of the LINQ provider, whether that is an OR/M or any other LINQ source.

For example, some OR/Ms may provide lazy loading. In that case your "entire list of customers" would behave something like a cursor, and accessing one of its items (an employee), or one of that item's properties, would load only that employee or only the accessed property.

But, anyway, these are the basics.

EDIT: Now I see it's a LINQ-to-SQL thing... Or perhaps the question's author misunderstood LINQ and doesn't know that LINQ isn't the same as LINQ-to-SQL; it's more of a pattern and a language feature.


OK, now thanks to this answer I have an idea: how about combining the page-taking function with yield return? Here is a code sample:

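// This is the original function that takes the page.
// It takes IQueryable so that Skip and Take are translated by the LINQ provider.
public static IQueryable<TSource> Page<TSource>(this IQueryable<TSource> source, int pageNumber, int pageSize) {
  return source.Skip((pageNumber - 1) * pageSize).Take(pageSize);
}

// And here is the function with yield implementation.
// It yields individual records, so the caller's foreach sees one record at a time.
public static IEnumerable<TSource> Lazy<TSource>(this IQueryable<TSource> source, int pageSize) {
  int pageNumber = 1;
  int count;

  do {
    // Materialize the page once, so each page costs a single query
    List<TSource> page = Page(source, pageNumber, pageSize).ToList();
    count = page.Count;
    pageNumber++;
    foreach (TSource item in page) {
      yield return item;
    }
  } while (count == pageSize); // a short or empty page means there is no more data
}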


// And here goes our code for traversing collection with paging and foreach
var result = from emps in dc.Employees
  where emps.age > 21
  select emps;

// Let's use the 1000 page size
foreach (var emp in result.Lazy(1000)) {
  // Append this record in suitable format to the end of XML file
}

I think this way we can overcome the memory issue while keeping the foreach syntax simple.
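
The paging logic can be sanity-checked without a database. The sketch below runs a record-by-record variant of the paged iterator over an in-memory source; the names PagingSketch, InPages and CountAll are hypothetical. Because AsQueryable over an in-memory sequence executes Skip and Take in memory, this checks only the loop logic, not the SQL a provider would generate. The OrderBy is an added assumption: paging is only reliable when the rows have a stable order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Pages are 1-based, matching the Page method in the thread.
    public static IQueryable<T> Page<T>(this IQueryable<T> source, int pageNumber, int pageSize)
    {
        return source.Skip((pageNumber - 1) * pageSize).Take(pageSize);
    }

    // Yields records one by one while fetching them a page at a time.
    public static IEnumerable<T> InPages<T>(this IQueryable<T> source, int pageSize)
    {
        int pageNumber = 1;
        int count;
        do
        {
            // One query per page; only this page is held in memory.
            List<T> page = source.Page(pageNumber, pageSize).ToList();
            count = page.Count;
            pageNumber++;
            foreach (T item in page)
            {
                yield return item;
            }
        } while (count == pageSize); // a short or empty page ends the loop
    }

    // Hypothetical check: 2,500 in-memory integers read in pages of 1,000.
    public static int CountAll()
    {
        IQueryable<int> source = Enumerable.Range(1, 2500).AsQueryable().OrderBy(n => n);
        return source.InPages(1000).Count();
    }
}
```

PagingSketch.CountAll() returns 2500: the iterator reads pages of 1000, 1000 and 500 records and stops after the short page.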
