Possible Duplicate:
Dealing with commas in a CSV file
I wrote myself a CSV parser. It works fine until I hit this record:
B002VECGTG,B002VECGTG,HAS_17131_spaceshooter,"4,426",0.04%,"4,832",0.03%,0%,1,0.02%,$20.47 ,1
The commas inside the quoted fields "4,426" and "4,832" break my parser.
This is what I am using to parse the line of text:
char[] comma = { ',' };
string[] words = line.Split(comma);
How do I prevent my program from breaking?
You can't just split on commas. To implement a proper parser for that case, you need to loop through the string yourself, keeping track of whether you are inside quotes or not. While you are inside a quoted string, commas are part of the field, so keep going until you find the closing quote.
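IEnumerable<string> LineSplitter(string line)
{
    int fieldStart = 0;
    for (int i = 0; i < line.Length; i++)
    {
        if (line[i] == ',')
        {
            // An unquoted comma ends the current field.
            yield return line.Substring(fieldStart, i - fieldStart);
            fieldStart = i + 1;
        }
        else if (line[i] == '"')
        {
            // Skip ahead to the closing quote so commas inside quotes are ignored.
            // The length check guards against an unterminated quote.
            for (i++; i < line.Length && line[i] != '"'; i++) { }
        }
    }
    // The last field has no trailing comma, so return it after the loop.
    yield return line.Substring(fieldStart);
}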
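Note that the splitter above returns quoted fields with their quotes still attached. As a rough sketch of the same idea with an explicit "inside quotes" flag, the variant below also strips the surrounding quotes and treats a doubled quote ("") inside a quoted field as one literal quote. The name SplitCsvLine is hypothetical and not part of the original answer, and the snippet assumes C# 9 or later so it can run as a top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not from the original answer): splits one CSV line,
// removing surrounding quotes and treating "" inside quotes as a literal ".
static List<string> SplitCsvLine(string line)
{
    var fields = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;

    for (int i = 0; i < line.Length; i++)
    {
        char c = line[i];
        if (inQuotes)
        {
            if (c != '"')
            {
                current.Append(c);           // commas land here while quoted
            }
            else if (i + 1 < line.Length && line[i + 1] == '"')
            {
                current.Append('"');         // doubled quote -> one literal quote
                i++;
            }
            else
            {
                inQuotes = false;            // closing quote
            }
        }
        else if (c == '"')
        {
            inQuotes = true;                 // opening quote
        }
        else if (c == ',')
        {
            fields.Add(current.ToString());  // unquoted comma ends the field
            current.Clear();
        }
        else
        {
            current.Append(c);
        }
    }
    fields.Add(current.ToString());          // last field has no trailing comma
    return fields;
}

var words = SplitCsvLine("B002VECGTG,B002VECGTG,HAS_17131_spaceshooter,\"4,426\",0.04%,\"4,832\",0.03%,0%,1,0.02%,$20.47 ,1");
Console.WriteLine(words.Count);  // 12
Console.WriteLine(words[3]);     // 4,426
```

This still handles only a single line at a time; quoted fields that contain line breaks would need the reader to feed it complete records rather than raw lines.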
I suggest using an existing CSV parser instead of writing your own.
There are some nuances to parsing CSV correctly, as you have already found out.
There are many third-party parsers (several of them free), and there is even one built into the framework: the TextFieldParser class in the Microsoft.VisualBasic.FileIO namespace.
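For illustration, here is a minimal sketch of TextFieldParser applied to the record from the question. It assumes C# 9 or later (top-level statements) and that the Microsoft.VisualBasic assembly is available; on older .NET Framework projects you would likely need to add that reference yourself. The variable names are only for this example.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// The problem record from the question, held in a string for this example.
string line = "B002VECGTG,B002VECGTG,HAS_17131_spaceshooter,\"4,426\",0.04%,\"4,832\",0.03%,0%,1,0.02%,$20.47 ,1";

string[] fields;
using (var parser = new TextFieldParser(new StringReader(line)))
{
    parser.TextFieldType = FieldType.Delimited;  // delimited, not fixed-width
    parser.SetDelimiters(",");                   // fields are comma-separated
    parser.HasFieldsEnclosedInQuotes = true;     // commas inside quotes stay in the field
    fields = parser.ReadFields();                // parses one record
}

Console.WriteLine(fields.Length);  // 12
Console.WriteLine(fields[3]);      // 4,426
```

To read a whole file, the same parser can be constructed from a file path and ReadFields called in a loop while EndOfData is false.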
It is possible to use a Regex:
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
// "input" is the CSV text; each match is one row, each capture of "value" is one field.
List<List<String>> rows = new List<List<String>>();
MatchCollection matches = Regex.Matches(input, @"^(?:(?:\s*""(?<value>[^""]*)""\s*|(?<value>[^,]*)),)*?(?:\s*""(?<value>[^""]*)""\s*|(?<value>[^,]*))$", RegexOptions.Multiline);
foreach (Match row in matches)
{
    List<String> values = new List<String>();
    foreach (Capture value in row.Groups["value"].Captures)
    {
        values.Add(value.Value);
    }
    rows.Add(values);
}
I do not suggest that it is the best solution, but for small files (a couple of rows) it probably isn't too bad.