I'm using the AForge.NET framework and its neural network library.
Currently, when I train my network, I create lots of images (one image per letter per font) at a large size (30 pt), cut out the actual letter, scale it down to a smaller size (10x10 px), and save it to my hard disk. I can then read all those images back in and build my double[] data arrays from them. At the moment I do this on a pixel basis.
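For reference, pixel-based extraction of a 10x10 letter into a double[] can look roughly like this (a minimal sketch using plain System.Drawing; the inverted-brightness normalization is just one common convention):

```csharp
using System.Drawing;

// Turn one 10x10 letter bitmap into a 100-element network input vector.
// Each entry is the inverted pixel brightness, so ink is ~1.0 and background ~0.0.
static double[] ImageToInput(Bitmap letter)
{
    double[] input = new double[letter.Width * letter.Height];
    for (int y = 0; y < letter.Height; y++)
    {
        for (int x = 0; x < letter.Width; x++)
        {
            // Color.GetBrightness() returns 0 (black) .. 1 (white); invert so dark pixels fire
            input[y * letter.Width + x] = 1.0 - letter.GetPixel(x, y).GetBrightness();
        }
    }
    return input;
}
```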
Once I have successfully trained the network, I test it by running it on a sample image containing the alphabet at different sizes (uppercase and lowercase).
But the results are not really promising. I trained the network until RunEpoch reported an error of about 1.5 (so almost no error), yet some letters in my test image are still not identified correctly.
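For context, the training itself follows the standard AForge.Neuro backpropagation pattern; in the sketch below the layer sizes, learning rate and stopping condition are placeholders:

```csharp
using AForge.Neuro;
using AForge.Neuro.Learning;

// inputs[i]  = 100-element pixel vector for one letter image
// outputs[i] = one-hot target vector (one entry per letter class)
static ActivationNetwork TrainNetwork(double[][] inputs, double[][] outputs)
{
    // 100 inputs -> hidden layer -> one output neuron per class (sizes are placeholders)
    var network = new ActivationNetwork(new SigmoidFunction(2.0), 100, 50, outputs[0].Length);
    var teacher = new BackPropagationLearning(network) { LearningRate = 0.1 };

    for (int epoch = 0; epoch < 5000; epoch++)
    {
        // RunEpoch returns the error summed over all samples in the epoch,
        // so a value like 1.5 has to be judged relative to the sample count.
        double error = teacher.RunEpoch(inputs, outputs);
        if (error < 0.01 * inputs.Length)
            break;
    }
    return network;
}
```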
Now my question is: is this caused by a flawed learning method (pixel-based input versus the receptors suggested in this article: http://www.codeproject.com/KB/cs/neural_network_ocr.aspx - are there other methods I can use to extract the data for the network?), or can this happen because the segmentation algorithm I use to extract the letters from the test image is bad?
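For concreteness, my understanding of the receptor idea from that article is roughly this: a fixed set of line segments laid over the letter image, where each receptor fires if it crosses ink. The following is only a rough sketch of that idea, not the article's actual code:

```csharp
using System.Drawing;

// Receptor-style features: each receptor is a line segment over the 10x10 image;
// its feature is 1.0 if any dark pixel lies (approximately) on the segment.
// Receptor endpoints are assumed to lie inside the image bounds.
static double[] ReceptorFeatures(Bitmap letter, (Point a, Point b)[] receptors)
{
    const int steps = 20; // how many points to sample along each segment
    double[] features = new double[receptors.Length];
    for (int i = 0; i < receptors.Length; i++)
    {
        var (a, b) = receptors[i];
        for (int s = 0; s <= steps; s++)
        {
            // sample points along the segment and check for ink
            int x = a.X + (b.X - a.X) * s / steps;
            int y = a.Y + (b.Y - a.Y) * s / steps;
            if (letter.GetPixel(x, y).GetBrightness() < 0.5)
            {
                features[i] = 1.0;
                break;
            }
        }
    }
    return features;
}
```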
Does anyone have ideas on how to improve it?
I would try making your network inputs scale-invariant. In other words, preprocess the test image: segment out individual candidate letter objects and resize them to the same size as your training samples. From your description it sounds like you're not doing this. I'm not familiar with AForge, so maybe this is already implied in your question.
My experience with neural networks is that preprocessing the input data usually leads to much better results when there is a known good way to do it, and here it seems there is.
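Something along these lines with AForge.Imaging's blob tools should get you most of the way (a rough sketch: the threshold and minimum blob size are guesses, and it assumes a 24bpp input bitmap):

```csharp
using System.Collections.Generic;
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// Segment candidate letters from a page image and normalize each one
// to the same 10x10 size used for the training set.
static List<Bitmap> ExtractLetterCandidates(Bitmap page)
{
    // binarize: grayscale (expects a 24/32bpp source), then a fixed threshold (value is a guess)
    Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(page);
    new Threshold(128).ApplyInPlace(gray);

    // BlobCounter treats non-black pixels as objects, so invert dark-text-on-light pages
    Bitmap blobInput = new Invert().Apply(gray);

    // find connected components, ignoring tiny specks
    var blobCounter = new BlobCounter { FilterBlobs = true, MinWidth = 3, MinHeight = 3 };
    blobCounter.ProcessImage(blobInput);

    var letters = new List<Bitmap>();
    foreach (Rectangle rect in blobCounter.GetObjectsRectangles())
    {
        // crop each blob's bounding box from the original-polarity image and resize it
        // to the training size; you may want to sort the rectangles into reading order first
        Bitmap cropped = new Crop(rect).Apply(gray);
        letters.Add(new ResizeBilinear(10, 10).Apply(cropped));
    }
    return letters;
}
```

Whatever polarity and normalization you used for the training images, apply exactly the same steps to these candidates before feeding them to the network.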