How can I declare a thousand separator in read.csv? [duplicate]

Developer https://www.devze.com 2022-12-21 22:09 Source: Internet
This question already has answers here: How to read data when some numbers contain commas as thousand separator? (11 answers) Closed 2 years ago.

The dataset I want to read in contains numbers with and without a comma as thousand separator:

"Sudan", "15,276,000", "14,098,000", "13,509,000"
"Chad", 209000, 196000, 190000

and I am looking for a way to read this data in.

Any hint appreciated!


Since there is an "r" tag under the question, I assume this is an R question. In R, you do not need to do anything special to handle the quoted commas:

> read.csv('t.csv', header=FALSE)
     V1          V2          V3          V4
1 Sudan  15,276,000  14,098,000  13,509,000
2  Chad      209000      196000      190000

# If you want to convert them to numbers, strip the commas first:
> df <- read.csv('t.csv', header=FALSE, stringsAsFactors=FALSE)
> df$V2 <- as.numeric(gsub(',', '', df$V2))
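
The snippet above only converts V2. A minimal sketch of the same idea applied to every numeric column might look like the following. It assumes the two sample rows from the question are inlined through `text=` instead of being read from 't.csv', and the name `num_cols` is just an illustrative helper, not something from the original answer.

```r
# Sample rows from the question, inlined so the sketch runs without a file.
txt <- c('"Sudan", "15,276,000", "14,098,000", "13,509,000"',
         '"Chad", 209000, 196000, 190000')

df <- read.csv(text = txt, header = FALSE, stringsAsFactors = FALSE)

# Hypothetical helper: the columns that should end up numeric.
num_cols <- c("V2", "V3", "V4")

# Drop the thousand separators, then convert each column to numeric.
df[num_cols] <- lapply(df[num_cols],
                       function(x) as.numeric(gsub(",", "", x)))
```

`as.numeric` tolerates the leading space left over from the ", " separators, so no extra trimming is needed for the numeric columns.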


Looking at that set of data, you could parse it using ", " (note the extra space) as the separator instead of ",".
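
If this is R, note that `read.csv` expects a single-character `sep`, so as far as I know the two-character separator has to be applied by hand. A rough sketch under that assumption, using `strsplit` on the sample rows from the question (the variable names are only illustrative):

```r
# Sample rows from the question.
lines <- c('"Sudan", "15,276,000", "14,098,000", "13,509,000"',
           '"Chad", 209000, 196000, 190000')

# Split each line on the two-character separator ", ",
# then strip the quote marks from every field.
rows <- lapply(strsplit(lines, ", ", fixed = TRUE),
               function(f) gsub('"', "", f, fixed = TRUE))

# Bind the rows into a data frame; columns are named V1..V4 by default.
df <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

# Every column except the first holds numbers: drop the commas and convert.
df[-1] <- lapply(df[-1],
                 function(x) as.numeric(gsub(",", "", x, fixed = TRUE)))
```

This only works because the thousand-separator commas are never followed by a space, while the field-separator commas always are.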


You could use the following regular expression to remove the commas and any surrounding quote marks, leaving plain CSV content:

,(?=[0-9])|"

Then process it as normal.
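
In R this could be sketched roughly as follows. The lookahead `(?=[0-9])` needs `perl = TRUE` in `gsub`, and the sample rows are inlined here rather than read from a file, which is an assumption made for illustration.

```r
# Sample rows from the question.
lines <- c('"Sudan", "15,276,000", "14,098,000", "13,509,000"',
           '"Chad", 209000, 196000, 190000')

# Remove every comma that is directly followed by a digit, and every quote mark.
cleaned <- gsub(',(?=[0-9])|"', "", lines, perl = TRUE)

# What is left is plain CSV; strip.white trims the space after each separator.
df <- read.csv(text = cleaned, header = FALSE, strip.white = TRUE,
               stringsAsFactors = FALSE)
```

One caveat: the pattern relies on the field separator being ", " with a space. In a file written as `Chad,209000,196000`, the separating commas are also followed by a digit and would be removed too.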


How about doing it as a two-step process? 1. Replace the "," with a TAB character. 2. Split on TAB.

I'm assuming .NET here, but the same principle would apply in any language.
