I am using libsvm to train a support vector machine classifier with a Gaussian kernel. The libsvm website provides a Python script, grid.py, to select the best C and gamma.
I am wondering how training time and overfitting/underfitting change with gamma and C.
Is it correct that:
Suppose C goes from 0 to +infinity: does the trained model go from underfitting to overfitting, and does the training time increase?
Suppose gamma goes from almost 0 to +infinity: does the trained model go from underfitting to overfitting, and does the training time increase?
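For the gamma half of this, it may help to look at the Gaussian kernel directly. libsvm's RBF kernel is K(x, y) = exp(-gamma * |x - y|^2). Below is a minimal sketch in plain Python; the toy points and the helper names `rbf_kernel` and `kernel_matrix` are made up for illustration and are not part of libsvm. With gamma near 0 every pair of points looks almost identical to the kernel, and with a very large gamma each point is similar only to itself, which fits the underfitting-to-overfitting intuition. It says nothing about training time.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(points, gamma):
    """Pairwise kernel values for a list of points."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

# Hypothetical toy points, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]

for gamma in (1e-6, 1.0, 1e3):
    print("gamma =", gamma)
    for row in kernel_matrix(points, gamma):
        print("  ", ["%.3f" % v for v in row])
```

For the smallest gamma the matrix is nearly all ones, and for the largest it is effectively the identity matrix.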
In grid.py, the default search order runs C from small to large but gamma from large to small. Is that meant to order the runs from short to long training time, and the trained models from underfitting to overfitting, so that we can perhaps save time in selecting the values of C and gamma?
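For reference, here is a small sketch that enumerates the default grid, assuming the ranges printed in the grid.py usage message (log2c = -5,15,2 and log2g = 3,-15,-2); please check them against your own copy. The helper `log2_range` is a hypothetical name, not a function from grid.py.

```python
def log2_range(begin, end, step):
    """Exponents from begin to end inclusive; step may be negative."""
    if step == 0:
        raise ValueError("step must be non-zero")
    values = []
    k = begin
    while (step > 0 and k <= end) or (step < 0 and k >= end):
        values.append(k)
        k += step
    return values

# Assumed defaults: C ascends from 2^-5 to 2^15, gamma descends from 2^3 to 2^-15.
c_exponents = log2_range(-5, 15, 2)
g_exponents = log2_range(3, -15, -2)

c_values = [2.0 ** k for k in c_exponents]
gamma_values = [2.0 ** k for k in g_exponents]

print("C:    ", c_values[0], "...", c_values[-1])
print("gamma:", gamma_values[0], "...", gamma_values[-1])
print(len(c_values) * len(gamma_values), "combinations to evaluate")
```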
Thanks and regards!
Good question. I don't have a sure answer, and I would like to know myself. But in response to this part of the question:
So we can perhaps save time in selecting the values of C and gamma?
... I find that, with libsvm, there is definitely a "right" value for C and gamma, and it is highly problem-dependent. So regardless of the order in which gamma is searched, many candidate values for gamma must be tested. Ultimately, I don't know of any shortcut around this parameter search, which can be time-consuming (depending on your problem) but is necessary.
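One thing that can at least reduce the number of candidates is the coarse-then-fine search suggested in the libsvm practical guide: scan a coarse grid first, then a finer grid around the best coarse point. Here is a rough sketch of that idea in plain Python. `toy_score` is a hypothetical stand-in for cross-validation accuracy (in practice you would call your own training and validation routine there), and `grid_search` is an illustrative helper, not something grid.py exposes.

```python
import math

def grid_search(score, c_exps, g_exps):
    """Return (best_score, log2c, log2g) over a grid of exponents."""
    best = None
    for lc in c_exps:
        for lg in g_exps:
            s = score(2.0 ** lc, 2.0 ** lg)
            if best is None or s > best[0]:
                best = (s, lc, lg)
    return best

def toy_score(c, gamma):
    """Hypothetical stand-in for CV accuracy; peaks at C=2^3, gamma=2^-4."""
    return -((math.log2(c) - 3) ** 2 + (math.log2(gamma) + 4) ** 2)

# Coarse pass over a wide range of exponents.
coarse = grid_search(toy_score, range(-5, 16, 2), range(3, -16, -2))

# Fine pass: half-steps within +/-2 of the best coarse exponents.
_, lc, lg = coarse
fine_c = [lc - 2 + 0.5 * i for i in range(9)]
fine_g = [lg - 2 + 0.5 * i for i in range(9)]
fine = grid_search(toy_score, fine_c, fine_g)

print("coarse best (score, log2c, log2g):", coarse)
print("fine best   (score, log2c, log2g):", fine)
```

This still evaluates every point of each grid, so it only cuts the cost compared with running one fine grid over the whole range; it is not a way around the search itself.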