Do you choose LINQ over for loops?

Developer https://www.devze.com 2023-01-09 09:01 Source: web

Given a datatable containing two columns like this:

Private Function CreateDataTable() As DataTable
    Dim customerTable As New DataTable("Customers")
    customerTable.Columns.Add(New DataColumn("Id", GetType(System.Int32)))
    customerTable.Columns.Add(New DataColumn("Name", GetType(System.String)))

    Dim row1 = customerTable.NewRow()
    row1.Item("Id") = 1
    row1.Item("Name") = "Customer 1"
    customerTable.Rows.Add(row1)

    Dim row2 = customerTable.NewRow()
    row2.Item("Id") = 2
    row2.Item("Name") = "Customer 2"
    customerTable.Rows.Add(row2)

    Dim row3 = customerTable.NewRow()
    row3.Item("Id") = 3
    row3.Item("Name") = "Customer 3"
    customerTable.Rows.Add(row3)

    Return customerTable
End Function

Would you use this snippet to retrieve a List(Of Integer) containing all Ids:

Dim table = CreateDataTable()

Dim list1 As New List(Of Integer)

For i As Integer = 0 To table.Rows.Count - 1
    list1.Add(CType(table.Rows(i)("Id"), Integer))
Next

Or rather this one:

Dim list2 = (From r In table.AsEnumerable _
             Select r.Field(Of Integer)("Id")).ToList()

This is not a question about how to cast the Id column to Integer (.Field(Of Integer), CType, CInt, DirectCast or whatever), but about whether you generally choose LINQ over for loops, as the subject implies.


For those who are interested: I ran some iterations with both versions which resulted in the following performance graph:

graph http://dnlmpq.blu.livefilestore.com/y1pOeqhqQ5neNRMs8YpLRlb_l8IS_sQYswJkg17q8i1K3SjTjgsE4O97Re_idshf2BxhpGdgHTD2aWNKjyVKWrQmB0J1FffQoWh/analysis.png?psid=1

The vertical axis shows the milliseconds it took the code to convert the rows' Ids into a generic list, with the number of rows shown on the horizontal axis. The blue line resulted from the imperative approach (for loop), the red line from the declarative code (LINQ).
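For anyone who wants to reproduce a measurement like this, here is a minimal sketch of a timing harness. BuildTable, IdsWithLoop, and IdsWithLinq are hypothetical helper names (not from the original post), and the row count is arbitrary. A single Stopwatch run like this gives only a rough indication, not a rigorous benchmark. On .NET Framework it assumes a reference to System.Data.DataSetExtensions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Diagnostics
Imports System.Linq

Module TimingSketch
    ' Hypothetical helper: same two columns as CreateDataTable, arbitrary row count.
    Function BuildTable(rowCount As Integer) As DataTable
        Dim table As New DataTable("Customers")
        table.Columns.Add(New DataColumn("Id", GetType(Integer)))
        table.Columns.Add(New DataColumn("Name", GetType(String)))
        For i As Integer = 1 To rowCount
            table.Rows.Add(i, "Customer " & i.ToString())
        Next
        Return table
    End Function

    ' Imperative version from the question.
    Function IdsWithLoop(table As DataTable) As List(Of Integer)
        Dim ids As New List(Of Integer)
        For i As Integer = 0 To table.Rows.Count - 1
            ids.Add(CType(table.Rows(i)("Id"), Integer))
        Next
        Return ids
    End Function

    ' Declarative version from the question.
    Function IdsWithLinq(table As DataTable) As List(Of Integer)
        Return table.AsEnumerable().Select(Function(r) r.Field(Of Integer)("Id")).ToList()
    End Function

    Sub Main()
        Dim table = BuildTable(100000) ' arbitrary size, chosen for illustration

        Dim watch = Stopwatch.StartNew()
        Dim list1 = IdsWithLoop(table)
        watch.Stop()
        Console.WriteLine("For loop: {0} ms", watch.ElapsedMilliseconds)

        watch = Stopwatch.StartNew()
        Dim list2 = IdsWithLinq(table)
        watch.Stop()
        Console.WriteLine("LINQ: {0} ms", watch.ElapsedMilliseconds)
    End Sub
End Module
```

Running each version several times and discarding the first run (which includes JIT compilation) would give steadier numbers.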


Whatever way you generally choose: Why do you go that way and not the other?


Whenever possible I favor the declarative way of programming over the imperative one. With a declarative approach you state what you want and leave the how to the runtime and libraries, which can then adapt the execution to the characteristics of the machine. For example, on a machine with multiple cores the query could be parallelized, whereas an imperative for loop basically locks out this possibility. Note that plain LINQ to Objects does not parallelize anything by itself; PLINQ needs an explicit AsParallel() call. Today there may be no big difference, but I think that in the future more and more extensions like PLINQ will appear, allowing better optimization.
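To illustrate the parallelization point, here is a minimal sketch of how a LINQ query can be turned into a PLINQ query (available from .NET 4). Transform is a hypothetical stand-in for per-item work; for something this cheap, the parallel version is unlikely to be faster, so treat it as a shape to follow rather than an optimization.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module PlinqSketch
    ' Hypothetical per-item work; real code would do something more expensive.
    Function Transform(id As Integer) As Integer
        Return id * 2
    End Function

    ' Sequential LINQ to Objects query.
    Function TransformSequential(ids As IEnumerable(Of Integer)) As List(Of Integer)
        Return ids.Select(Function(id) Transform(id)).ToList()
    End Function

    ' Same query with PLINQ: AsParallel() is the explicit opt-in,
    ' AsOrdered() keeps the results in source order.
    Function TransformParallel(ids As IEnumerable(Of Integer)) As List(Of Integer)
        Return ids.AsParallel().AsOrdered().Select(Function(id) Transform(id)).ToList()
    End Function

    Sub Main()
        Dim ids = Enumerable.Range(1, 10)
        Dim result = TransformParallel(ids)
        Console.WriteLine(result.Count)
    End Sub
End Module
```

The query body is unchanged between the two functions; only the source is wrapped. Doing the same with a hand-written for loop would mean restructuring the loop itself.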


I avoid LINQ unless it helps readability a lot, because it completely destroys edit-and-continue.

When they fix that, I will probably start using it more, because I do like the syntax a lot for some things.


For almost everything I've done I've come to the conclusion that LINQ is optimized enough. If I handcrafted a for loop it would have better performance, but in the grand scheme of things we are usually talking milliseconds. Since I rarely have a situation where those milliseconds will make any kind of impact, I find it's much more important to have readable code with clear intentions. I would much rather have a call that is 50ms slower than have someone come along and break it altogether!


ReSharper has a cool feature that will flag loops and convert them into LINQ expressions. I will flip to the LINQ version and see if that hurts or helps readability. If the LINQ expression communicates the intent of the code more clearly, I will go with that. If the LINQ expression is unreadable, I will flip back to the foreach version.

Most of the performance issues don't really compare with readability for me.

Clarity trumps cleverness.

In the above example, I would go with the LINQ version, since it clearly explains the intent and also keeps people from accidentally adding side effects in the loop.


I recently found myself wondering whether I've been totally spoiled by LINQ. Yes, I now use it all the time to pick all sorts of things out of all sorts of collections.


I started to, but found that in some cases I saved time by using this approach:

for (var i = 0, len = list.Count; i < len; i++) { .. }

Not necessarily in all cases, but in some. Most LINQ extension methods iterate with foreach internally.


I try to follow these rules:

  • Whenever I'm just querying (filtering, projecting, ...) collections, I use LINQ.
  • As soon as I'm actually 'doing' something with the result (i.e., introducing side effects), I use a for loop.

So in this example, I'll use LINQ.

Also, I always try to split up the 'query definition' from the 'query evaluation':

Dim query = From r In table.AsEnumerable() 
            Select r.Field(Of Integer)("Id")

Dim result = query.ToList()

This makes it clear when that (in this case in-memory) query will be evaluated.
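The difference between definition and evaluation can be made visible with a small sketch. IdsAfterLateAdd is a hypothetical name, and the one-column table is a cut-down version of the one in the question. On .NET Framework it assumes a reference to System.Data.DataSetExtensions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module DeferredExecutionSketch
    ' Hypothetical demo: defines the query, changes the table, then evaluates.
    Function IdsAfterLateAdd() As List(Of Integer)
        Dim table As New DataTable("Customers")
        table.Columns.Add(New DataColumn("Id", GetType(Integer)))
        table.Rows.Add(1)
        table.Rows.Add(2)

        ' Query definition: no rows are read yet.
        Dim query = table.AsEnumerable().Select(Function(r) r.Field(Of Integer)("Id"))

        ' This row is added after the definition but before the evaluation.
        table.Rows.Add(3)

        ' Query evaluation: ToList() enumerates the rows as they are now,
        ' so the late row is part of the result.
        Return query.ToList()
    End Function

    Sub Main()
        Console.WriteLine(IdsAfterLateAdd().Count)
    End Sub
End Module
```

Because the query only runs at ToList(), keeping that call on its own line shows exactly where the work happens and which state of the table it sees.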
