I don't understand how the CART (Classification and Regression Tree) algorithm splits a continuous attribute. As we know, CART can split both categorical and continuous attributes.
I have read many papers, and they say the split point is the middle value in the sequence. I don't understand that. Could you explain what it means and give me some examples?
Thanks.
The general process is to scan through candidate split values for a given predictor, measure the quality of each split, and select the best one. For efficiency's sake, the scan may not try every possible split and may instead try every percentile or some other reduced set of candidates. The quality of a split can be measured in a number of ways, such as information gain or twoing, among others.
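To make the "middle value" idea concrete: a common way to build the candidate list is to sort the attribute's values and take the midpoint between each pair of neighbouring distinct values. Here is a minimal sketch of that scan in Python. It assumes Gini impurity as the quality measure (one common choice for classification), and the function names `gini` and `best_split` and the toy data are made up for illustration. It is not meant to reproduce the exact procedure in the CART book.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best test 'value <= threshold'."""
    pairs = sorted(zip(values, labels))  # sort rows by attribute value
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2  # the "middle value" between two neighbours
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Quality of the split: size-weighted impurity of the two children.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data: does a customer buy, given their age?
ages = [22, 25, 30, 35, 40, 52]
buys = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(ages, buys))  # (32.5, 0.0)
```

With this data the candidates are 23.5, 27.5, 32.5, 37.5 and 46. The split at 32.5 (halfway between 30 and 35) separates the two classes perfectly, so it gets the lowest weighted impurity and is selected.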
If you are talking specifically about the CART algorithm originally described by Breiman, Friedman, Olshen, and Stone, then check their book, "Classification and Regression Trees" (1984).