I'm using the tessnet2 wrapper for the Tesseract 2.04 source on Windows XP, configured to build for x86.
The TessarctTest project's main function contains:
Bitmap bmp = new Bitmap(@"C:\temp\New Folder\dotnet\eurotext.tif");
tessnet2.Tesseract ocr = new tessnet2.Tesseract();
// ocr.SetVariable("tessedit_char_whitelist", "0123456789");
ocr.Init(@"C:\temp\tessdata", "eng", false);
// List<tessnet2.Word> r1 = ocr.DoOCR(bmp, new Rectangle(792, 247, 130, 54));
List<tessnet2.Word> r1 = ocr.DoOCR(bmp, Rectangle.Empty);
int lc = tessnet2.Tesseract.LineCount(r1);
When I try to run the program, it crashes on the following line inside ocr.Init:
int result = m_myTessBaseAPIInstance->InitWithLanguage((char *)_tessdata.ToPointer(), NULL, (char *)_lang.ToPointer(), NULL, numericMode, 0, NULL);
Does anyone have an idea?
Much appreciated!
For anyone still having a problem after all of this: if you're using tessnet2, make sure you download the correct language files.
You want the English language data for Tesseract (2.00 and up), not the English language data for Tesseract 3.01. I hope this saves you a few hours! :)
For those using the Tessnet2 assembly for the Tesseract OCR engine in C# who find that the Tesseract.Init()
method crashes the app: I found one possible cause.
First, I'm assuming you have the files as follows:
bin\Debug\MyDotNetApp.exe
bin\Debug\tessdata\eng.DangAmbigs
bin\Debug\tessdata\eng.freq-dawg
bin\Debug\tessdata\eng.inttemp
bin\Debug\tessdata\eng.pffmtable
bin\Debug\tessdata\eng.unicharset
bin\Debug\tessdata\eng.user-words
bin\Debug\tessdata\eng.word-dawg
And are using this for the initialization:
using (var ocr = new tessnet2.Tesseract())
{
ocr.Init(null, "eng", false);
...
}
In theory that should work. For me it did work, but then all of a sudden it didn't, even though I hadn't changed anything that would affect it.
For me the fix was to search through the registry (using regedit) and remove all references to tesseract. There were some suspicious entries that I think may have been created when I installed the Tesseract 3.00 installer (tesseract-ocr-setup-3.00.exe).
When I deleted those entries and rebooted (I had tried rebooting before removing the reg entries, FYI), everything worked again.
Were the registry entries causing the problem? Who knows. But it did fix my problem.
Project + Properties, Debug tab, scroll down, tick the "Enable unmanaged code debugging" checkbox. Now you can set a breakpoint and debug it.
If your IDE doesn't support mixed mode debugging, you can attach a debugger using the technique outlined in this post.
Make sure your tessdata folder (C:\temp\tessdata) contains the English language data files. The files are: eng.DangAmbigs, eng.freq-dawg, eng.inttemp, eng.normproto, eng.pffmtable, eng.unicharset, eng.user-words, eng.word-dawg. Download the files from the Tesseract downloads page. The file to download is tesseract-2.00.eng.tar.gz.
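Since a missing data file shows up as a crash inside Init rather than as a managed exception, it can help to check the folder before calling Init. Below is a minimal sketch of such a check; the class and method names (TessdataCheck, MissingFiles) are made up for illustration and are not part of tessnet2, and the file list is simply the one given above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of tessnet2.
static class TessdataCheck
{
    // Suffixes of the Tesseract 2.x language data files listed in the answer above.
    static readonly string[] RequiredSuffixes =
    {
        "DangAmbigs", "freq-dawg", "inttemp", "normproto",
        "pffmtable", "unicharset", "user-words", "word-dawg"
    };

    // Returns the names of the expected data files that are not present in tessdataDir.
    public static List<string> MissingFiles(string tessdataDir, string lang)
    {
        var missing = new List<string>();
        foreach (string suffix in RequiredSuffixes)
        {
            string name = lang + "." + suffix;
            if (!File.Exists(Path.Combine(tessdataDir, name)))
            {
                missing.Add(name);
            }
        }
        return missing;
    }
}
```

Calling TessdataCheck.MissingFiles(@"C:\temp\tessdata", "eng") before ocr.Init and logging any returned names should show quickly whether the folder is incomplete. It only checks that the files exist, not that they are the 2.x versions.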
In my case the answer from dkr88 did the job, thanks a lot. I guess some dependency got corrupted when Tesseract was previously installed as a standalone program. Furthermore, the OCR quality seems to be better than with MODI, although the tilt correction of the latter works under more extreme circumstances (vertical text).
I'm pretty happy with tessnet2 now. There is only one drawback: I needed to change my app.config (as described on the internet) and add the following:
<startup useLegacyV2RuntimeActivationPolicy="true">
<supportedRuntime version="v4.0"/>
</startup>
My problem was that I wasn't running the application with Administrator permissions.
When I right-clicked, chose Run as, and selected Local Administrator, it worked.
In my case, I made the changes below to get it to work :)
- Downloaded https://tesseract-ocr.googlecode.com/files/tesseract-2.00.eng.tar.gz
- Pasted tessdata folder to my Debug folder
- And did the following code changes
ocr.Init(@"D:\MyApplication\MyApplication\Debug", "eng", false);
to
ocr.Init(null, "eng", false);
In my case I set the tessdata files to copy always, and then it didn't crash on the init line.