finding all lines in a list that contain x or y?

Developer https://www.devze.com 2023-01-23 04:25 Source: Internet
Can I do this without looping through the whole list?

List<string> responseLines = new List<string>();

The list is then filled with around 300 lines of text.

Next I want to search the list and create a second list of all lines that either start with "abc" or contain "xyz".

I know I can use a foreach, but is there a better or faster way?


You could use LINQ. This is no different performance-wise to using foreach -- that's pretty much what it does behind the scenes -- but you might prefer the syntax:

var query = responseLines.Where(s => s.StartsWith("abc") || s.Contains("xyz"))
                         .ToList();

(If you're happy dealing with an IEnumerable<string> rather than List<string> then you can omit the final ToList call.)
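
To illustrate what omitting ToList means in practice, here is a minimal sketch. The class name DeferredQueryDemo and the sample lines are made up for illustration. Without ToList the query is evaluated lazily, each time it is enumerated, so it reflects lines added to the list after the query was defined.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the sample data is invented for illustration.
public static class DeferredQueryDemo
{
    public static (int Before, int After) Run()
    {
        var responseLines = new List<string> { "abc one", "two", "three xyz" };

        // No ToList: at this point the query only describes the filter.
        IEnumerable<string> query =
            responseLines.Where(s => s.StartsWith("abc") || s.Contains("xyz"));

        int before = query.Count();      // enumerates the list now
        responseLines.Add("abc four");   // added after the query was defined
        int after = query.Count();       // enumerates again and sees the new line

        return (before, after);
    }
}
```

If you want a snapshot that does not change when responseLines changes, keep the ToList call.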


var newList = (from line in responseLines
              where line.StartsWith("abc") || line.Contains("xyz")
              select line).ToList();


Try this:

List<string> responseLines = new List<string>();
List<string> myLines = responseLines.Where(line => line.StartsWith("abc", StringComparison.InvariantCultureIgnoreCase) || line.Contains("xyz")).ToList();

The StartsWith and Contains tests are joined with ||, which short-circuits: Contains is evaluated only if StartsWith returns false. This still iterates the whole list, which is unavoidable if you want to check every line, but it saves you from typing out a foreach.
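
The short-circuit behaviour can be observed with a small sketch like the one below. ShortCircuitDemo and the counting helper are hypothetical names used only for this illustration; the helper wraps Contains so the number of calls can be counted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: counts how often the second test actually runs.
public static class ShortCircuitDemo
{
    public static (int Matches, int ContainsCalls) Run(IEnumerable<string> lines)
    {
        int containsCalls = 0;

        // Local helper that records every call before delegating to Contains.
        bool ContainsXyz(string s)
        {
            containsCalls++;
            return s.Contains("xyz");
        }

        // ContainsXyz is only reached when StartsWith returns false.
        int matches = lines.Count(
            s => s.StartsWith("abc", StringComparison.Ordinal) || ContainsXyz(s));

        return (matches, containsCalls);
    }
}
```

For three input lines where one starts with "abc", the helper should be called only for the other two.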


Use LINQ:

List<string> list = responseLines.Where(x => x.StartsWith("abc") || x.Contains("xyz")).ToList();


Unless you need all the text for some reason, it would be quicker to inspect each line at the time when you were generating the List and discard the ones that don't match without ever adding them.

This depends on how the List is loaded as well - that code is not shown. This would be effective if you were reading from a text file since then you could just use your LINQ query to operate directly on the input data using File.ReadLines as the source instead of the final List<string>.

    // Requires: using System.IO; using System.Linq;
    var query = File.ReadLines("input.txt")
        .Where(s => s.StartsWith("abc") || s.Contains("xyz"))
        .ToList();


LINQ works well as far as offering you improved syntax for this sort of thing (See LukeH's answer for a good example), but it isn't any faster than iterating over it by hand.

If you need to do this operation often, you might want to come up with some kind of indexed data structure that watches for all "abc" or "xyz" strings as they come into the list, and can thereby use a faster algorithm for serving them up when asked, rather than iterating through the whole list.
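
One possible shape for such a structure is sketched below. It is only an illustration: FilteredLineList is a hypothetical class, and it assumes the filter ("abc" prefix or "xyz" substring) is fixed in advance. Each line is tested once, when it is added, so asking for the matches later needs no scan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: keeps a second list of matching lines up to date
// as lines are added, instead of filtering the whole list on demand.
public sealed class FilteredLineList
{
    private readonly List<string> _all = new List<string>();
    private readonly List<string> _matches = new List<string>();

    public IReadOnlyList<string> All => _all;
    public IReadOnlyList<string> Matches => _matches;

    public void Add(string line)
    {
        _all.Add(line);

        // The test runs once per line, at insertion time.
        if (line.StartsWith("abc", StringComparison.Ordinal) || line.Contains("xyz"))
        {
            _matches.Add(line);
        }
    }
}
```

The total work is the same as one full scan; it is just spread over the inserts, which only pays off if the matches are requested repeatedly.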

If you don't have to do it often, it's probably a "premature optimization."


Quite simply, no algorithm can guarantee that you will never have to examine every item in the list. Sorting the list before you search can reduce the work for the prefix test, though: in an ordinally sorted list all lines that start with "abc" sit next to each other, so a binary search can find them without a full scan. The Contains("xyz") test gets no help from sort order, so every remaining line still has to be checked for it.

Assuming that it's not practical for you to have a pre-sorted list by the time you need to search through it, then the only way to improve the speed of your search would be to use a different data structure than a list - for example, a binary search tree.
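
A minimal sketch of the sorting idea follows, assuming the list has already been sorted with StringComparer.Ordinal. SortedPrefixSearch and its methods are hypothetical names. It only speeds up the "starts with" part; lines that merely contain "xyz" still need a separate scan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; 'sorted' must already be sorted with StringComparer.Ordinal.
public static class SortedPrefixSearch
{
    // Index of the first element that is >= key under ordinal comparison.
    private static int LowerBound(List<string> sorted, string key)
    {
        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (string.CompareOrdinal(sorted[mid], key) < 0)
                lo = mid + 1;
            else
                hi = mid;
        }
        return lo;
    }

    // Binary search to the first candidate, then walk forward while the prefix still matches.
    public static List<string> StartingWith(List<string> sorted, string prefix)
    {
        var result = new List<string>();
        for (int i = LowerBound(sorted, prefix);
             i < sorted.Count && sorted[i].StartsWith(prefix, StringComparison.Ordinal);
             i++)
        {
            result.Add(sorted[i]);
        }
        return result;
    }
}
```

Sorting costs more than a single pass over the list, so for around 300 lines searched once, a plain scan is likely the simpler choice.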
