List<object>.RemoveAll - How to create an appropriate Predicate_问答_开发者

This is a bit of noob question - I'm still fairly new to C# and generics and completely new to predicates, delegates and lambda expressions...

I have a class 'Enquiries' which contains a generic list of another class called 'Vehicles'. I'm building up the code to add/edit/delete Vehicles from the parent Enquiry. And at the moment, I'm specifically looking at deletions.

From what I've read so fa开发者_如何学运维r, it appears that I can use Vehicles.RemoveAll() to delete an item with a particular VehicleID or all items with a particular EnquiryID. My problem is understanding how to feed .RemoveAll the right predicate - the examples I have seen are too simplistic (or perhaps I am too simplistic given my lack of knowledge of predicates, delegates and lambda expressions).

So if I had a List<Of Vehicle> Vehicles where each Vehicle had an EnquiryID, how would I use Vehicles.RemoveAll() to remove all vehicles for a given EnquiryID?

I understand there are several approaches to this so I'd be keen to hear the differences between approaches - as much as I need to get something working, this is also a learning exercise.

As an supplementary question, is a Generic list the best repository for these objects? My first inclination was towards a Collection, but it appears I am out of date. Certainly Generics seem to be preferred, but I'm curious as to other alternatives.

The RemoveAll() methods accept a Predicate<T> delegate (until here nothing new). A predicate points to a method that simply returns true or false. Of course, the RemoveAll will remove from the collection all the T instances that return True with the predicate applied.

C# 3.0 lets the developer use several methods to pass a predicate to the RemoveAll method (and not only this one…). You can use:

Lambda expressions

vehicles.RemoveAll(vehicle => vehicle.EnquiryID == 123);

Anonymous methods

vehicles.RemoveAll(delegate(Vehicle v) {
  return v.EnquiryID == 123;
});

Normal methods

vehicles.RemoveAll(VehicleCustomPredicate);
private static bool
VehicleCustomPredicate (Vehicle v) {
    return v.EnquiryID == 123; 
}

A predicate in T is a delegate that takes in a T and returns a bool. List<T>.RemoveAll will remove all elements in a list where calling the predicate returns true. The easiest way to supply a simple predicate is usually a lambda expression, but you can also use anonymous methods or actual methods.

{
    List<Vehicle> vehicles;
    // Using a lambda
    vehicles.RemoveAll(vehicle => vehicle.EnquiryID == 123);
    // Using an equivalent anonymous method
    vehicles.RemoveAll(delegate(Vehicle vehicle)
    {
        return vehicle.EnquiryID == 123;
    });
    // Using an equivalent actual method
    vehicles.RemoveAll(VehiclePredicate);
}

private static bool VehiclePredicate(Vehicle vehicle)
{
    return vehicle.EnquiryID == 123;
}

This should work (where enquiryId is the id you need to match against):

vehicles.RemoveAll(vehicle => vehicle.EnquiryID == enquiryId);

What this does is passes each vehicle in the list into the lambda predicate, evaluating the predicate. If the predicate returns true (ie. vehicle.EnquiryID == enquiryId), then the current vehicle will be removed from the list.

If you know the types of the objects in your collections, then using the generic collections is a better approach. It avoids casting when retrieving objects from the collections, but can also avoid boxing if the items in the collection are value types (which can cause performance issues).

Little bit off topic but say i want to remove all 2s from a list. Here's a very elegant way to do that.

void RemoveAll<T>(T item,List<T> list)
{
    while(list.Contains(item)) list.Remove(item);
}

With predicate:

void RemoveAll<T>(Func<T,bool> predicate,List<T> list)
{
    while(list.Any(predicate)) list.Remove(list.First(predicate));
}

+1 only to encourage you to leave your answer here for learning purposes. You're also right about it being off-topic, but I won't ding you for that because of there is significant value in leaving your examples here, again, strictly for learning purposes. I'm posting this response as an edit because posting it as a series of comments would be unruly.

Though your examples are short & compact, neither is elegant in terms of efficiency; the first is bad at O(n²), the second, absolutely abysmal at O(n³). Algorithmic efficiency of O(n²) is bad and should be avoided whenever possible, especially in general-purpose code; efficiency of O(n³) is horrible and should be avoided in all cases except when you know n will always be very small. Some might fling out their "premature optimization is the root of all evil" battle axes, but they do so naïvely because they do not truly understand the consequences of quadratic growth since they've never coded algorithms that have to process large datasets. As a result, their small-dataset-handling algorithms just run generally slower than they could, and they have no idea that they could run faster. The difference between an efficient algorithm and an inefficient algorithm is often subtle, but the performance difference can be dramatic. The key to understanding the performance of your algorithm is to understand the performance characteristics of the primitives you choose to use.

In your first example, list.Contains() and Remove() are both O(n), so a while() loop with one in the predicate & the other in the body is O(n²); well, technically O(m*n), but it approaches O(n²) as the number of elements being removed (m) approaches the length of the list (n).

Your second example is even worse: O(n³), because for every time you call Remove(), you also call First(predicate), which is also O(n). Think about it: Any(predicate) loops over the list looking for any element for which predicate() returns true. Once it finds the first such element, it returns true. In the body of the while() loop, you then call list.First(predicate) which loops over the list a second time looking for the same element that had already been found by list.Any(predicate). Once First() has found it, it returns that element which is passed to list.Remove(), which loops over the list a third time to yet once again find that same element that was previously found by Any() and First(), in order to finally remove it. Once removed, the whole process starts over at the beginning with a slightly shorter list, doing all the looping over and over and over again starting at the beginning every time until finally no more elements matching the predicate remain. So the performance of your second example is O(m*m*n), or O(n³) as m approaches n.

Your best bet for removing all items from a list that match some predicate is to use the generic list's own List<T>.RemoveAll(predicate) method, which is O(n) as long as your predicate is O(1). A for() loop technique that passes over the list only once, calling list.RemoveAt() for each element to be removed, may seem to be O(n) since it appears to pass over the loop only once. Such a solution is more efficient than your first example, but only by a constant factor, which in terms of algorithmic efficiency is negligible. Even a for() loop implementation is O(m*n) since each call to Remove() is O(n). Since the for() loop itself is O(n), and it calls Remove() m times, the for() loop's growth is O(n²) as m approaches n.

I wanted to address something none of the answers have so far:

From what I've read so far, it appears that I can use Vehicles.RemoveAll() to delete an item with a particular VehicleID. As an[sic] supplementary question, is a Generic list the best repository for these objects?

Assuming VehicleID is unique as the name suggests, a list is a terribly inefficient way to store them when you get a lot of vehicles, as removal(and other methods like Find) is still O(n). Have a look at a HashSet<Vehicle> instead, it has O(1) removal(and other methods) using:

int GetHashCode(Vehicle vehicle){return vehicle.VehicleID;}
int Equals(Vehicle v1, Vehicle v2){return v1.VehicleID == v2.VehicleID;}

Removing all vehicles with a specific EnquiryID still requires iterating over all elements this way, so you could consider a GetHashCode that returns the EnquiryID instead, depending on which operation you do more often. This has the downside of a lot of collisions if a lot of Vehicles share the same EnquiryID though.

In this case, a better alternative is to make a Dictionary<int, List<Vehicle>> that maps EnquiryIDs to Vehicles and keep that up to date when adding/removing vehicles. Removing these vehicles from a HashSet is then an O(m) operation, where m is the number of vehicles with a specific EnquiryID.