Good in training and bad in prediction

开发者 https://www.devze.com 2023-01-26 09:50 Source: the web

I've written code for prediction with a neural network.

The training error is good (below 1%), but the prediction error is high (about 20%). I think my network is overtrained, but I don't know how to solve this problem. I've changed the number of layers, the number of neurons and the training function, but the result has not changed.

So I have posted my code here in the hope of getting an answer: link text

The zip file contains 2 files:

1- An Excel file with the data: lines 1-4 are the training inputs, line 5 is the training output (line 6 is also an output, but it is not used in this code), lines 7-10 are the testing inputs, and line 11 is the testing output.

2- The MATLAB code

After running the program, 4 charts appear: the first row is for the training data and the second row is for the test data.

If somebody knows the answer, please change my code and post it back.

Thanks a lot.

EDIT:

More description:

I have 2 outputs and separate code for each. For line 6 (the second output) this code gives acceptable results, but for line 5 the results are not good.

If you think your suggestion is useful, please apply it to my code and post the changed code here. The suggestions I have received in other forums were general solutions that had no influence on the results.

As others have mentioned, you are likely overfitting the ANN to the training data. Depending on the dataset, you can get an arbitrarily good fit of the training data if you train long enough. Another possible problem is that the training data does not properly represent the problem space, i.e. there are inputs in the test data that are very dissimilar to the data you used for training. If that is the case, there is no way the ANN can function adequately.
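A quick way to check the second possibility is to compare each test input against the range of values seen during training. The following is only a rough sketch in Python rather than the poster's MATLAB; the helper names and the sample numbers are made up for illustration, and it assumes each row is one sample with 4 inputs (the Excel file in the question stores the inputs the other way round, one input per line).

```python
def feature_ranges(rows):
    """Return a (min, max) pair for every input feature; each row is one sample."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def out_of_range_fraction(train_rows, test_rows):
    """Fraction of test samples with at least one input outside the training range."""
    ranges = feature_ranges(train_rows)
    outside = 0
    for row in test_rows:
        if any(v < lo or v > hi for v, (lo, hi) in zip(row, ranges)):
            outside += 1
    return outside / len(test_rows)


# Invented data, 4 inputs per sample (not the poster's data).
train = [[0.1, 1.0, 5.0, 10.0], [0.4, 2.0, 6.0, 12.0], [0.9, 3.0, 7.0, 15.0]]
test = [[0.5, 2.5, 6.5, 11.0], [1.5, 2.0, 6.0, 12.0]]

fraction = out_of_range_fraction(train, test)
print(fraction)
```

A large fraction suggests the network is being asked to extrapolate, which early stopping or regularization will not fix. A small fraction does not prove the test data is well covered, since this check looks at each input separately and ignores combinations of inputs.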

To overcome the overfitting, try this: split the data into 3 sets (training, validation and testing). While training the ANN, also calculate the error on the validation set. If the validation error does not improve for, let's say, 5 epochs (you can always configure this), then stop training.
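The stopping rule described here can be sketched as a small loop. This is an illustration in Python, not the poster's MATLAB code: `train_one_epoch` and `validation_error` are hypothetical callables standing in for a real network, and the validation curve below is invented so that the behaviour is easy to follow.

```python
def train_with_early_stopping(train_one_epoch, validation_error,
                              patience=5, max_epochs=1000):
    """Train until the validation error has not improved for `patience` epochs.

    Returns (best_epoch, best_error, last_epoch_run).
    """
    best_error = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        error = validation_error()
        if error < best_error:
            # A real implementation would also save the weights here.
            best_error, best_epoch = error, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_error, epoch


# Invented validation curve: improves for 4 epochs, then gets worse.
curve = [0.30, 0.22, 0.18, 0.17, 0.19, 0.20, 0.21, 0.23, 0.25, 0.30, 0.35]
state = {"epoch": 0}


def fake_train_one_epoch():
    state["epoch"] += 1  # stand-in for one pass over the training set


def fake_validation_error():
    return curve[state["epoch"] - 1]


result = train_with_early_stopping(fake_train_one_epoch, fake_validation_error)
print(result)
```

With this invented curve the best validation error is reached at epoch 4 and training stops at epoch 9, after 5 epochs without improvement. The test set plays no part in the loop; it is only used once at the end to estimate the prediction error.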

Also, as a general point: I did not have a chance to look at your data and source code, but remember that you need a significant amount of data to get good results. If you only have a few data points, it will be very hard or even impossible to achieve good results.

I recommend reading the guide here for a good overview of many aspects of ANNs.

Good luck!

If you believe the problem might be overtraining, try training the networks only until they reach 5% or 10% error instead of 1%. The lower the training error, the more difficult it will be for them to generalize; they will only learn to recognize EXACTLY what you gave them.

If you are using MATLAB, try training your network with Bayesian regularization instead of the default Levenberg-Marquardt algorithm (net.trainFcn = 'trainbr' instead of 'trainlm').