I am looking for a programmatic way to retrieve the text in images. I am not aware of any existing tool for this. I need to download the images first and then extract the text from them. Is there a programmatic way to do this?
Tesseract OCR can extract text from images. What exactly do you mean by extract?
OCR is a complex technology (image segmentation, angle correction, binarization, character segmentation, analysis of combined and broken characters, dictionary checking, etc.), but there are ready-to-use OCR engines. Most of them are commercial. For instance:
- Most accurate (and expensive + royalty) - ABBYY OCR engine.
- Good accuracy (royalty) - OmniPage OCR engine.
- Good accuracy (royalty-free) - Nicomsoft CrystalOCR engine.
- Acceptable accuracy (free) - Tesseract OCR engine.
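
To make the "download first, then extract" flow concrete, here is a rough sketch using the free option from the list, Tesseract. It assumes the `tesseract` command-line tool is installed and on your PATH, and it calls it through `subprocess` rather than any particular wrapper library. The function names (`local_name_for`, `download_image`, `tesseract_command`, `extract_text`) are made up for this illustration, and there is no image preprocessing here, so accuracy on noisy images may be poor.

```python
import shutil
import subprocess
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name_for(url: str) -> str:
    """Derive a local file name from the last path segment of the URL."""
    name = Path(urlparse(url).path).name
    # Fall back to a generic name when the URL has no file-like segment.
    return name or "image.png"


def download_image(url: str, dest_dir: Path) -> Path:
    """Download the image at `url` into `dest_dir` and return its local path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / local_name_for(url)
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


def tesseract_command(image_path: Path, lang: str = "eng") -> list:
    """Build the command line: tesseract <image> stdout -l <lang>."""
    # Passing "stdout" as the output base makes tesseract print the
    # recognized text instead of writing it to a .txt file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def extract_text(image_path: Path, lang: str = "eng") -> str:
    """Run tesseract on a local image and return the recognized text."""
    # Assumption: the tesseract binary is installed and on PATH.
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout.strip()
```

Usage would look something like `extract_text(download_image(url, Path("downloads")))`, where `url` is whatever image address you have. Building the command in its own function keeps it easy to swap the language code or add further options later.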