I have a CSV file with possibly missing data, and the data is both chars and numbers. What is the best way to deal with this?
Here is an example:
file.csv
name,age,gender
aaa,20,m
bbb,25,
ccc,,m
ddd,40,f
readMyCSV.m
fid = fopen('file.csv','rt');
C = textscan(fid, '%s%f%s', 'Delimiter',',', 'HeaderLines',1, 'EmptyValue',NaN);
fclose(fid);
[name,age,gender] = deal(C{:});
The data read:
>> [name num2cell(age) gender]
ans =
'aaa' [ 20] 'm'
'bbb' [ 25] ''
'ccc' [NaN] 'm'
'ddd' [ 40] 'f'
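Once the columns are read this way, missing entries can be located with logical masks. The following is only a sketch under the assumptions of the example above: it recreates the sample file in a temporary location, and the variable names missingAge, missingGender, complete and meanAge are made up for illustration.

```matlab
% Recreate the sample file from the question in a temporary location.
fname = [tempname '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, 'name,age,gender\naaa,20,m\nbbb,25,\nccc,,m\nddd,40,f');
fclose(fid);

% Read it back exactly as in readMyCSV.m.
fid = fopen(fname, 'rt');
C = textscan(fid, '%s%f%s', 'Delimiter',',', 'HeaderLines',1, 'EmptyValue',NaN);
fclose(fid);
delete(fname);
[name, age, gender] = deal(C{:});

% Missing numbers were read as NaN, missing strings as empty char arrays.
missingAge    = isnan(age);
missingGender = cellfun(@isempty, gender);

% Rows with no missing field at all.
complete = ~(missingAge | missingGender);

% Example use: average age over the rows where age is known.
meanAge = mean(age(~missingAge));
```

From here, name(complete) would select only the fully populated rows, while the masks let you treat each kind of gap separately.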
What @Amro has suggested is the most common way to read a CSV file with missing values. In your case, since your data contains both characters and numbers, you should provide the proper format specifier for each column, matching the number and order of columns in your own file. So your call to textscan should look something like this:
C = textscan(fid, '%d32 %c %d8 %d8 %d32 %f32 %f %s ','HeaderLines', 1, 'Delimiter', ',');
For more data formats, look here: http://www.mathworks.com/help/techdoc/ref/textscan.html
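One thing to keep in mind when picking integer specifiers such as %d32 for a column that may have gaps: integer types cannot store NaN, so a different marker is needed for empty fields. A possible approach, sketched here against the three-column sample file from the question, is to pass a sentinel through 'EmptyValue'. The sentinel -1 and the variable name missingAge are assumptions chosen for illustration, not something the question prescribes.

```matlab
% Recreate the sample file from the question in a temporary location.
fname = [tempname '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, 'name,age,gender\naaa,20,m\nbbb,25,\nccc,,m\nddd,40,f');
fclose(fid);

% Read age as int32. Since int32 has no NaN, use -1 (an assumed sentinel
% that cannot be a real age) to mark empty fields.
fid = fopen(fname, 'rt');
C = textscan(fid, '%s%d32%s', 'Delimiter',',', 'HeaderLines',1, 'EmptyValue',-1);
fclose(fid);
delete(fname);
[name, age, gender] = deal(C{:});

% Rows where the age field was empty in the file.
missingAge = (age == -1);
```

If the column can legitimately contain every integer value, reading it with %f and keeping NaN as the marker is probably the simpler choice.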