Log processing with LINQ

Developer https://www.devze.com 2022-12-16 18:49 Source: Internet
I know that it might not be the most performant, but I want to process some logs with a LINQ statement. Here is what the log looks like:

RECORD  DEVON   1   6748
bla bla bla bla bla bla
 bla bla bla bla bla bla
RECORD  JASON   1   7436
bla bla bla bla bla bla
 bla bla bla bla bla bla
RECORD  DEVON   2   9123
RECORD  DEVON   3   3723
RECORD  SHERRIE 1   6434
RECORD  DEVON   4   3732
bla bla bla bla bla bla
 bla bla bla bla bla bla
bla bla bla bla bla bla
RECORD  SHERRIE 2   6434
 bla bla bla bla bla bla
bla bla bla bla bla bla
 bla bla bla bla bla bla
 bla bla bla bla bla bla
RECORD  SHERRIE 3   9123
RECORD  DEVON   5   3723
 bla bla bla bla bla bla
RECORD  JASON   2   9123
RECORD  DEVON   6   3723
 bla bla bla bla bla bla
 bla bla bla bla bla bla
RECORD  JASON   3   9123

Now I want to filter out anything that doesn't start with RECORD, and group it by the name column (JASON, DEVON, SHERRIE), and then cross join it by name so it looks like this:

DEVON   JASON   SHERRIE
1/6748  1/7436  1/6434
2/9123  2/9123  2/6434
3/3723  3/9123  3/9123
4/3732      
5/3723      
6/3723      

Is this possible to do in a single LINQ statement?


You can get the results as rows in one go with LINQ (here I'm using method syntax):

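// Requires: using System; using System.IO; using System.Linq;
string[] lines = File.ReadAllLines("input.txt");
var result =
    // StartsWith also handles blank or short lines without throwing.
    lines.Where(line => line.StartsWith("RECORD"))
         // Split on spaces and tabs so either separator works.
         .Select(line => line.Split(new char[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
         // Key: name column; element: "sequence/value".
         .GroupBy(columns => columns[1],
                  columns => columns[2] + "/" + columns[3])
         .Select(group => group.Key + " " + string.Join(", ", group.ToArray()));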

Output:

DEVON 1/6748, 2/9123, 3/3723, 4/3732, 5/3723, 6/3723
JASON 1/7436, 2/9123, 3/9123
SHERRIE 1/6434, 2/6434, 3/9123

I think it's difficult to transpose the rows to columns without a standard Zip function though. Maybe this is good enough for you? If not, then you will probably have to do the last bit with a helper method that iterates over the separate IEnumerables.
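
For what it's worth, the transposition can also be done without Zip by indexing into each group and padding the shorter ones. The following is only a sketch under a few assumptions: fields are separated by spaces or tabs, the columns should appear in alphabetical order as in the desired output from the question, cells are joined with tabs, and the names LogPivot and Pivot are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LogPivot // hypothetical helper name, not from the answers above
{
    public static string Pivot(IEnumerable<string> lines)
    {
        // Keep RECORD lines only and collect the "sequence/value" cells per name.
        var columns = lines
            .Where(line => line.StartsWith("RECORD", StringComparison.Ordinal))
            .Select(line => line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            .GroupBy(fields => fields[1], fields => fields[2] + "/" + fields[3])
            .OrderBy(g => g.Key, StringComparer.Ordinal)
            .Select(g => new { Name = g.Key, Cells = g.ToList() })
            .ToList();

        // One output row per index up to the longest column;
        // columns that have run out contribute an empty cell.
        var depth = columns.Count == 0 ? 0 : columns.Max(c => c.Cells.Count);
        var rows = new[] { string.Join("\t", columns.Select(c => c.Name)) }
            .Concat(Enumerable.Range(0, depth)
                .Select(i => string.Join("\t",
                    columns.Select(c => i < c.Cells.Count ? c.Cells[i] : ""))));

        return string.Join(Environment.NewLine, rows);
    }
}
```

Calling LogPivot.Pivot(File.ReadAllLines("input.txt")) on the sample log should give a header line followed by six rows, one per DEVON record, with the JASON and SHERRIE cells left empty from the fourth row on. It is two expressions rather than one statement, because the row count has to be known before the rows can be generated.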


Here is what I came up with:

public static string TransformLog(string fileName)
{
    const string tab = "\t";

    var fileLines = File.ReadAllLines(fileName);

    var testAreas = fileLines
        .Where(l => l.StartsWith("RECORD" + tab))
        .Select(l => l.Split(tab.ToCharArray()).Skip(1).Take(3).ToArray())
        .GroupBy(l => l[0])
        .Select(g => new { g.Key, Enumerator = g.GetEnumerator() })
        .ToList();

    var sb = new StringBuilder();

    testAreas.ForEach(ta => sb.Append(ta.Key + tab + tab));

    sb.AppendLine();

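    bool cont;

    do
    {
        cont = false;
        var row = new StringBuilder();

        // Advance every group's enumerator once; exhausted groups contribute empty cells.
        testAreas.ForEach(ta =>
                              {
                                  var hasNext = ta.Enumerator.MoveNext();
                                  row.Append(hasNext ? ta.Enumerator.Current[1] + tab + ta.Enumerator.Current[2] + tab : tab + tab);
                                  cont |= hasNext;
                              });

        // Emit the row only if at least one group still had a record,
        // so the output does not end with a line of empty cells.
        if (cont)
        {
            sb.AppendLine(row.ToString());
        }

    } while (cont);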

    return sb.ToString();
}