Fine and fast CSV reader

Source: https://www.devze.com, 2023-01-27 00:27 (reposted from the web)
I'm using the Lumenworks CSV reader, and I must say I'm not very happy with how it has worked so far.

I'm parsing thousands of CSV files within an hour, and there is always a problem: it either throws an exception complaining about bad records, or it skews the columns, and so on.

Can you recommend a good CSV reader? It doesn't have to be free, but it must be bug-free.

Thank you.


FileHelpers Open Source Library http://www.filehelpers.net/


Try CsvHelper (a library I maintain). It's also available on NuGet.


You mention that you are getting exceptions and the like from the files. While these may be undesired, have you investigated the cause?

You might just want to use one of the parsers already on the table and, when an exception occurs, try an alternative and/or handle those cases with custom code. I know it's not exactly what you are looking for, but the problem may not be the tools you are using but the input those tools are receiving...

You could also move the offending file to a separate directory (in code) to examine later, so that everything that can be processed gets processed.
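
The fallback-and-quarantine idea above could look roughly like the sketch below. Everything named here (CsvBatch, ParseOrQuarantine, the quarantineDir parameter, the shape of the parser delegates) is hypothetical and only for illustration; each delegate would wrap whichever real parser you use.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvBatch
{
    // Tries each parser in order and returns the rows from the first one
    // that does not throw. If every parser throws, the file is moved into
    // quarantineDir and null is returned.
    public static List<string[]> ParseOrQuarantine(
        string path,
        IEnumerable<Func<string, List<string[]>>> parsers,
        string quarantineDir)
    {
        foreach (var parse in parsers)
        {
            try
            {
                return parse(path);
            }
            catch (Exception)
            {
                // This parser rejected the file; fall through to the next one.
                // Real code would probably catch narrower exception types and log them.
            }
        }

        // No parser succeeded: set the file aside for later inspection.
        // File.Move throws if a file with the same name is already there.
        Directory.CreateDirectory(quarantineDir);
        string target = Path.Combine(quarantineDir, Path.GetFileName(path));
        File.Move(path, target);
        return null;
    }
}
```

A caller would loop over the incoming files, skip any null result, and review the quarantine directory afterwards; since bad files often share a source, the quarantined ones tend to point at the same producer.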


There is a CSV parser built into .NET.

From http://coding.abel.nu/2012/06/built-in-net-csv-parser/:

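// TextFieldParser is in the Microsoft.VisualBasic.FileIO namespace
// (add a reference to the Microsoft.VisualBasic assembly).
using System.Collections.Generic;
using System.Globalization;
using Microsoft.VisualBasic.FileIO;

// The method signature and the swedishCulture definition are assumed here;
// the excerpt shows only the body. Brand is a plain class with the four
// properties set below, and this method belongs inside a class.
static IEnumerable<Brand> ReadBrands(string path)
{
    CultureInfo swedishCulture = CultureInfo.GetCultureInfo("sv-SE");

    using (TextFieldParser parser = new TextFieldParser(path))
    {
        parser.CommentTokens = new string[] { "#" };
        parser.SetDelimiters(new string[] { ";" });
        parser.HasFieldsEnclosedInQuotes = true;

        // Skip over header line.
        parser.ReadLine();

        while (!parser.EndOfData)
        {
            string[] fields = parser.ReadFields();
            yield return new Brand()
            {
                Name = fields[0],
                FactoryLocation = fields[1],
                EstablishedYear = int.Parse(fields[2]),
                Profit = double.Parse(fields[3], swedishCulture)
            };
        }
    }
}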
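
Since the original complaint is about exceptions on bad records, here is a minimal sketch of how TextFieldParser can skip a malformed line and keep going. The class and method names (TolerantReader, ReadRows) and the badLines parameter are made up for this example; MalformedLineException and the ErrorLine property are part of Microsoft.VisualBasic.FileIO.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class TolerantReader
{
    // Returns the rows that parsed cleanly and appends the raw text of every
    // line that TextFieldParser rejected to badLines.
    // Note: disposing the parser also closes the TextReader passed in.
    public static List<string[]> ReadRows(TextReader input, string delimiter, List<string> badLines)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(delimiter);
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException)
                {
                    // ErrorLine holds the text of the line that failed to parse;
                    // the parser has already moved past it, so the loop continues.
                    badLines.Add(parser.ErrorLine);
                }
            }
        }
        return rows;
    }
}
```

Collecting the rejected lines rather than aborting lets the rest of the file go through, while still leaving a record of what needs a closer look.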


You have to check the input files. I think these tools don't stop at a format check because they aim for throughput (skipping erroneous data in order to process as many files as possible). In the real world you rarely see a stream of clean CSV. Drivers tend to produce their own variants:

- no quotes

- semicolon instead of comma

Files that are generating errors usually come from the same source.
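
One way to check a file before parsing is to guess its delimiter from the header line. The sketch below is a simple heuristic, not something any of the libraries mentioned here provide; CsvSniffer, GuessDelimiter and the candidate list (comma, semicolon, tab) are assumptions made for this example.

```csharp
using System;

public static class CsvSniffer
{
    // Returns the candidate delimiter that occurs most often outside double
    // quotes in the given line. Ties, and lines with no candidate at all,
    // resolve to the first candidate (',').
    public static char GuessDelimiter(string line)
    {
        char[] candidates = { ',', ';', '\t' };
        int[] counts = new int[candidates.Length];
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                // An escaped quote ("") toggles twice, leaving the state unchanged.
                inQuotes = !inQuotes;
                continue;
            }
            if (inQuotes) continue;

            int index = Array.IndexOf(candidates, c);
            if (index >= 0) counts[index]++;
        }

        int best = 0;
        for (int i = 1; i < candidates.Length; i++)
        {
            if (counts[i] > counts[best]) best = i;
        }
        return candidates[best];
    }
}
```

The guessed character can then be handed to whichever parser you use (for example, as the argument to SetDelimiters on TextFieldParser), and files whose guess differs from what the source normally sends can be flagged for review.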


It's been a long time since I used it, but FileHelpers does CSV parsing with lots of options.
