RapidMiner: CSV with real numbers that use commas instead of dots

Developer https://www.devze.com 2023-03-03 13:22 Source: web
I have a problem importing a CSV file with RapidMiner. Floating point values are written with a comma instead of a dot as the separator between the integer and decimal parts.

Does anyone know how to correctly import values formatted this way?

Sample data:

BMI;1;0;1;1;1;blue;-0,138812155;0,520378909;5;0;50;107;0;9;0;other;good;2011
BMI;1;0;1;1;1;pink;-0,624654696;;8;0;73;120;1;3;0,882638889;other;good;2011

RapidMiner currently interprets these columns as "polynominal". Forcing the type to "real" only gives a correct result for the "0" values.

Thanks


This seems to be a very old request. Not sure if this will help you, but this may help others with a similar situation.

Step 1: in the "Read CSV" operator, under "import configuration wizard", make sure you select "Semicolon" as the separator.

Step 2: use the "Guess Types" operator. Set Attribute Filter Type to Subset, use Select Attributes to select attributes 8, 9 and 16 (based on your example above), and change "decimal point character" to ",". You should then be all set.

Hope this helps (someone!)


Use the semicolon as the delimiter. You can use java.util.Scanner to read each line and String.split() to split it on the semicolon. When you get a token containing a comma, use String.replace() to change the comma to a decimal point, and then parse it with Float.parseFloat().

Hope this answers your question.


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadSemicolonCsv { // The class name is arbitrary.
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("c:\\path\\semicolons and numbers and commas.csv"));
        try {
            for (String line; (line = br.readLine()) != null; ) {
                // line now holds a single line from the file; this body runs once per line.
                // Split on the semicolon. split() takes a regex, so be careful when changing the
                // delimiter: some characters are special, e.g. . matches any character, not just a dot.
                String[] array = line.split(";");
                // Take field 7 (the eighth field) and turn it into a double (high precision floating point).
                // Replace , with . first so that parseDouble does not throw a NumberFormatException.
                double firstDouble = Double.parseDouble(array[7].replace(',', '.'));
                System.err.println("Have a number " + firstDouble);
                System.err.println("Can play with it " + (firstDouble * 2.0));
            }
        } finally {
            br.close(); // Free resources (and unlock the file on Windows).
        }
    }
}
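
As a variation on the code above, here is a minimal sketch that lets java.text.NumberFormat do the decimal-comma handling instead of replacing the character by hand. It assumes the data follows a locale that uses a decimal comma (Locale.GERMANY is my assumption, not something stated in the question), and the class and method names are made up for illustration. It also treats empty fields, such as the ninth field in the second sample row, as NaN, which is only one possible choice.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class DecimalCommaParser { // hypothetical name, for illustration only

    // Parses a number written with a decimal comma, e.g. "-0,138812155".
    // Empty fields are mapped to NaN (an assumption about how to treat missing values).
    // Note: in this locale a dot is a grouping separator, so "1.5" would NOT mean one and a half.
    static double parseDecimalComma(String field) {
        String trimmed = field.trim();
        if (trimmed.isEmpty()) {
            return Double.NaN;
        }
        NumberFormat format = NumberFormat.getInstance(Locale.GERMANY);
        ParsePosition position = new ParsePosition(0);
        Number number = format.parse(trimmed, position);
        // NumberFormat stops quietly at the first character it cannot use,
        // so check that the whole token was consumed.
        if (number == null || position.getIndex() != trimmed.length()) {
            throw new NumberFormatException("Not a decimal-comma number: " + field);
        }
        return number.doubleValue();
    }

    public static void main(String[] args) {
        // Second row of the sample data from the question.
        String line = "BMI;1;0;1;1;1;pink;-0,624654696;;8;0;73;120;1;3;0,882638889;other;good;2011";
        // A limit of -1 keeps empty fields instead of silently dropping trailing ones.
        String[] fields = line.split(";", -1);
        System.out.println("Field 8:  " + parseDecimalComma(fields[7]));
        System.out.println("Field 9:  " + parseDecimalComma(fields[8]));
        System.out.println("Field 16: " + parseDecimalComma(fields[15]));
    }
}
```

Whether this is preferable to the plain replace(',', '.') approach depends on the data: the locale-based parser also accepts grouping separators, which may or may not be what you want.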