
opencsv in Java ignores backslash in a field value

Developer (https://www.devze.com), 2023-03-06 13:12, source: Internet

I am reading a csv file using opencsv.

I am ignoring the first line of the file; the CSV file is tab-separated, with some values enclosed in double quotes.

The problem occurs when I read the values of a column that contains the '\' character: it is stripped out of the value.

reader = new CSVReader(new FileReader(exchFileObj),'\t','"',1);

For example in original file:

address = 12\91buenosaires   

It becomes:

address = 1291buenosaires

in the string array that CSVReader generates. How do I modify it to be able to read the '\' character as well?


I had the same problem and couldn't find another character that I could guarantee wouldn't show up in my CSV file. According to a post on SourceForge, though, you can pass '\0' as the escape character to the explicit constructor to indicate that you don't want any escape character.

http://sourceforge.net/tracker/?func=detail&aid=2983890&group_id=148905&atid=773542

CSVParser parser = new CSVParser(CSVParser.DEFAULT_SEPARATOR, CSVParser.DEFAULT_QUOTE_CHARACTER, '\0', CSVParser.DEFAULT_STRICT_QUOTES);

I did a bit of cursory testing, and this seems to work just fine, at least backslashes certainly make it through.
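To illustrate why turning the escape character off helps, here is a toy splitter written with plain JDK classes. It is only a simplified sketch and not opencsv's actual parsing logic; the class name EscapeDemo, the method splitLine and the constant NO_ESCAPE are made up for this example. It just shows the general idea: when '\' is the escape character it gets consumed, and when the escape is '\0' the backslash is treated like any other character.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified illustration only; this is NOT how opencsv is implemented. */
final class EscapeDemo {

    /** Hypothetical sentinel meaning "no escape character", mirroring the '\0' trick. */
    static final char NO_ESCAPE = '\0';

    /** Splits one line into fields, honouring a quote character and an optional escape. */
    static String[] splitLine(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escape != NO_ESCAPE && c == escape) {
                // The escape character itself is dropped; the next character is kept literally.
                if (i + 1 < line.length()) {
                    current.append(line.charAt(++i));
                }
            } else if (c == quote) {
                inQuotes = !inQuotes;          // quotes delimit the field and are not kept
            } else if (c == separator && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);             // with NO_ESCAPE, a backslash ends up here
            }
        }
        fields.add(current.toString());
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String line = "address\t\"12\\91buenosaires\"";
        System.out.println(splitLine(line, '\t', '"', '\\')[1]);      // prints 1291buenosaires
        System.out.println(splitLine(line, '\t', '"', NO_ESCAPE)[1]); // prints 12\91buenosaires
    }
}
```

The first call reproduces the symptom from the question, and the second shows the value coming through intact once no escape character is configured.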


opencsv also has a parser builder (CSVParserBuilder) through which you can set the escape character. If you use it and set the escape character to something that never appears in your data, the backslash characters are preserved in the parsed values.


In addition to @JMM's answer, you have to pass the created CSVParser to the constructor of the CSVReader. The only available constructor for this is:

public CSVReader(Reader reader, int line, CSVParser csvParser)

You can set line to 0 so that no lines are skipped.


Note: I think the solution in this answer is better than the three alternatives in that it configures a compliant reader in a coarse-grained manner, by relying on the RFC. The other answers go into the details of configuring an escape character. While that works, it seems more like a white-box solution.

By default, OpenCSV's reader does not comply with the writer. The reader is not RFC-compliant. Don't ask me why that is, as I find it as troubling and perplexing as you.

The solution is for you to configure your CSVReader with an RFC-compliant parser:

RFC4180Parser rfc4180Parser = new RFC4180ParserBuilder().build();
CSVReaderBuilder csvReaderBuilder =
  new CSVReaderBuilder(new StringReader(writer.toString()))
      .withCSVParser(rfc4180Parser);
reader = csvReaderBuilder.build();

Here is the source page for the above.
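For context on what "RFC-compliant" means here: RFC 4180 defines no escape character at all. A literal double quote inside a quoted field is written as two double quotes, and a backslash is just ordinary data. The following is a minimal plain-JDK sketch of that rule for a single comma-separated line. It is not the RFC4180Parser source; the class name Rfc4180Sketch and the method parseLine are hypothetical, and it ignores multi-line fields for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of RFC 4180 quoting for one line; not opencsv's RFC4180Parser. */
final class Rfc4180Sketch {

    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"');   // doubled quote stands for one literal quote
                        i++;
                    } else {
                        inQuotes = false;      // closing quote of the field
                    }
                } else {
                    current.append(c);         // backslashes are kept like any other character
                }
            } else if (c == '"') {
                inQuotes = true;               // opening quote of the field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Raw line: a,"12\91 ""BA""",c
        System.out.println(parseLine("a,\"12\\91 \"\"BA\"\"\",c").get(1)); // prints 12\91 "BA"
    }
}
```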


My opencsv version is 5.4, and the following code works fine.

CSVParser csvParser = new CSVParserBuilder().withSeparator(',').withEscapeChar('\0').build();