I am just wondering if there are any DLLs or features in VB.Net 2008 that I could use to convert a picture of text (for example, a screenshot) into text, assuming the text is in a very recognizable format (i.e., not CAPTCHA-style text).
If it is a very readable, unaltered, pure screenshot, then the simplest (though brute-force) way is to draw each candidate letter (using Graphics.DrawString) onto a bitmap and compare that bitmap, pixel by pixel, against the corresponding region of the screenshot. This can be reasonably quick compared with general-purpose OCR, and as long as the font, size, and rendering match the screenshot exactly, it should be very close to 100% accurate. It gets better if you only need to recognize text in a certain area, since reducing the search area speeds things up several times over. It gets better still if the text is in a fixed-width font and you know the font size (or can work it out by searching a small area): once a letter is recognized, you can skip its entire character cell.
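The render-and-compare idea above can be sketched roughly as follows. This is only a minimal sketch: RenderGlyph and MatchesAt are hypothetical helper names, the black-on-white colors and the disabled anti-aliasing are assumptions about the screenshot, and text drawn by Windows controls through GDI may not match Graphics.DrawString output pixel for pixel, so the rendering settings will likely need adjusting.

```vbnet
Imports System.Drawing
Imports System.Drawing.Text

Module GlyphMatchSketch
    ' Hypothetical helper: renders one character onto a small white bitmap.
    Function RenderGlyph(ByVal ch As Char, ByVal font As Font, ByVal size As Size) As Bitmap
        Dim bmp As New Bitmap(size.Width, size.Height)
        Using g As Graphics = Graphics.FromImage(bmp)
            g.Clear(Color.White)
            ' Assumption: the screenshot text was drawn without anti-aliasing.
            g.TextRenderingHint = TextRenderingHint.SingleBitPerPixelGridFit
            g.DrawString(ch.ToString(), font, Brushes.Black, 0, 0)
        End Using
        Return bmp
    End Function

    ' Hypothetical helper: True when every glyph pixel equals the
    ' screenshot pixel at the same position relative to (left, top).
    Function MatchesAt(ByVal shot As Bitmap, ByVal glyph As Bitmap, _
                       ByVal left As Integer, ByVal top As Integer) As Boolean
        ' Reject positions where the glyph would fall outside the screenshot.
        If left < 0 OrElse top < 0 Then Return False
        If left + glyph.Width > shot.Width Then Return False
        If top + glyph.Height > shot.Height Then Return False

        For y As Integer = 0 To glyph.Height - 1
            For x As Integer = 0 To glyph.Width - 1
                If shot.GetPixel(left + x, top + y).ToArgb() <> glyph.GetPixel(x, y).ToArgb() Then
                    Return False ' first mismatch ends the comparison early
                End If
            Next
        Next
        Return True
    End Function
End Module
```

To recognize a character cell, you would render each candidate character once, keep the bitmaps around, and call MatchesAt for each candidate at that cell's position until one matches. With a fixed-width font the next cell starts one character width to the right.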
If you don't know how to do this type of image manipulation, that's OK. Look at GetPixel and SetPixel on MSDN to start out, then, once speed matters, look for examples using LockBits.
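As a rough illustration of the LockBits approach, the sketch below copies a whole bitmap into an Integer array in one step, so later comparisons read from the array instead of calling GetPixel for every pixel. ReadPixels is a hypothetical helper name, and the sketch assumes that requesting the 32-bit ARGB format yields a positive stride of exactly Width * 4 bytes.

```vbnet
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.Runtime.InteropServices

Module LockBitsSketch
    ' Hypothetical helper: returns one ARGB value per pixel, row by row.
    Function ReadPixels(ByVal bmp As Bitmap) As Integer()
        Dim rect As New Rectangle(0, 0, bmp.Width, bmp.Height)
        Dim data As BitmapData = bmp.LockBits(rect, ImageLockMode.ReadOnly, _
                                              PixelFormat.Format32bppArgb)
        Try
            ' Assumption: at 32 bits per pixel the rows are contiguous
            ' (Stride = Width * 4), so a single copy covers the image.
            Dim pixels(bmp.Width * bmp.Height - 1) As Integer
            Marshal.Copy(data.Scan0, pixels, 0, pixels.Length)
            Return pixels
        Finally
            ' Always unlock, even if the copy throws.
            bmp.UnlockBits(data)
        End Try
    End Function
End Module
```

With this layout the pixel at (x, y) is pixels(y * bmp.Width + x), and its value can be compared directly against Color.ToArgb() results.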
Far and away your best bet on this one is to buy some OCR software to do it for you. Here's another option, although you'll have to wait for the results: http://www.labnol.org/software/convert-scanned-pdf-images-to-text-with-google-ocr/5158/