Error : Can not find dictionnary

Discussions about machine vision support in GdPicture.
Nigels
Posts: 28
Joined: Tue Jul 08, 2008 4:23 pm

Post by Nigels » Thu Sep 10, 2009 5:34 pm

Hi

I am testing a VB6 program (using the ActiveX version on XP Pro) that takes a 24-page TIFF document and does the following for each page (simplified, but this is the basis):
  • Saves the page as a single-page TIF
  • Deskews the page
  • Detects the page orientation using OCRTesseractGetOrientationEx and rotates the page
  • Does several OCR passes of the page using different whitelists to try to extract different types of data
  • If no results are found, rotates the page manually through 90, 180, and 270 degrees and runs the OCR passes again each time
Intermittently (about 1 run in every 3), the OCRTesseractDoOCR method raises the following error when the run reaches the 23rd page:
Can not find dictionnary: C:\Documents and Settings\All Users\Application Data\Mardak Client\OCR\eng.unicharset
This file is in the specified folder and is the same one as used to process the other 22 pages successfully (this path does not change during the run).
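For readers following along, the per-page retry logic described above can be sketched roughly as follows. This is only an illustration in Python, not the original VB6 code: `ocr_with_fallback` and the `run_ocr` callback are hypothetical names, and the callback stands in for whatever rotates the page and calls the real OCR method.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def ocr_with_fallback(
    run_ocr: Callable[[int, str], str],
    whitelists: Iterable[str],
    rotations: Iterable[int] = (0, 90, 180, 270),
) -> Optional[Tuple[int, Dict[str, str]]]:
    """Run every whitelist pass at each rotation until one rotation yields text.

    `run_ocr(rotation, whitelist)` is a hypothetical callback that returns the
    recognised text for the page rotated by `rotation` degrees.
    Returns (rotation, {whitelist: text}) for the first rotation that produced
    any non-blank text, or None if every rotation came back empty.
    """
    whitelists = list(whitelists)  # allow the iterable to be reused per rotation
    for rotation in rotations:
        results: Dict[str, str] = {}
        for whitelist in whitelists:
            text = run_ocr(rotation, whitelist)
            if text.strip():
                results[whitelist] = text
        if results:
            return rotation, results
    return None
```

Note that with several whitelists and four rotations, a difficult page can trigger many OCR calls, which is worth keeping in mind when an error only shows up late in a long run.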

Before I investigate this further, has anyone else seen this error and if so, what did you do to rectify it?

Many thanks

Nigel

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Error : Can not find dictionnary

Post by Loïc » Thu Sep 10, 2009 5:45 pm

Hi Nigel,

This error means that the OCR engine is unable to find or open the specified file.

If you are able to send us a basic application reproducing the error and the tiff file we will investigate to see if there is something wrong in the OCR.

Kind regards,

Loïc

Post by Nigels » Thu Sep 10, 2009 6:27 pm

Hi Loic

Thank you for the prompt reply.

Because of the many stages this is going through (and gdpicture is wrapped in a VB OCX being called from another VB app), this could be a really difficult one to reproduce - but I will try!

Any ideas or pointers as to where to start looking at what could be causing this?

Cheers

Nigel

Post by Loïc » Thu Sep 10, 2009 6:33 pm

You are welcome Nigel :wink:

To be honest, I have no idea what is causing your problem. We have never encountered this error before.

What I can suggest is to try your application on another computer and to disable your anti-virus, if you have one.

Cheers,

Loïc

Post by Nigels » Thu Sep 17, 2009 4:40 pm

Hi Loic

I have tried a different computer with no anti-virus, and the problem still occurs. I am currently trying to build a small app that demonstrates the error, but this is proving time-consuming and difficult.

I have just noticed a file called "tesseract.log" that gets created in the app folder. This has the following entry when the problem occurs:

Error: Unable to open c:\documents and settings\all users\application data\mardak client\ocr\eng.DangAmbigs!
Tess copped out!
recog_word: Choices list len:0; blob lists len:6
recog_word: Added dummy choice list
recog_word: Added dummy choice list
recog_word: Added dummy choice list
recog_word: Added dummy choice list
recog_word: Added dummy choice list
recog_word: Added dummy choice list
Unable to load unicharset file c:\documents and settings\all users\application data\mardak client\ocr\eng.unicharset
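As a small aid for anyone reading such a log, the file paths that the engine failed to open can be pulled out automatically. The sketch below assumes only the two message shapes visible in the excerpt above ("Unable to open ...!" and "Unable to load unicharset file ..."); `failed_paths` is a hypothetical helper name, and other log formats are not handled.

```python
import re

# Patterns for the two failure messages seen in the tesseract.log excerpt.
_FAILURE_PATTERNS = (
    re.compile(r"Unable to open (?P<path>.+?)!\s*$"),
    re.compile(r"Unable to load unicharset file (?P<path>.+?)\s*$"),
)


def failed_paths(log_text):
    """Return the distinct file paths reported as unopenable, in log order."""
    found = []
    for line in log_text.splitlines():
        for pattern in _FAILURE_PATTERNS:
            match = pattern.search(line)
            if match:
                path = match.group("path")
                if path not in found:
                    found.append(path)
                break  # one message per line is enough
    return found
```

Applied to the excerpt above, this lists eng.DangAmbigs first and eng.unicharset second, which shows that more than one language file could not be opened.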


Does this help throw any light on the problem?

Cheers

Nigel

Post by Loïc » Fri Sep 18, 2009 1:51 pm

Hi Nigel,

This error is raised when a C fopen() call fails to open a file in read-only mode.

Unfortunately, I can't tell you more. Maybe your application is locking this file somewhere?
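One way to test the locking theory independently of the OCR engine is to attempt the same kind of read-only open yourself and look at the error code. The following is a minimal sketch in Python; `try_open_readonly` is a hypothetical helper, not part of GdPicture. Run from a separate process, it can reveal locks, permissions, or a missing file, but it says nothing about limits inside the OCR process itself.

```python
import errno


def try_open_readonly(path):
    """Try to open `path` read-only and read one byte.

    Returns None on success, or a (symbolic_errno, message) tuple describing
    why the open failed, for example ("ENOENT", "No such file or directory").
    """
    try:
        with open(path, "rb") as handle:
            handle.read(1)
        return None
    except OSError as exc:
        # Map the numeric errno to its symbolic name where one is known.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return name, exc.strerror
```

A result of None while the OCR call still fails would suggest that the problem is not a lock held by another program.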

Kind regards,

Loïc

Post by Nigels » Wed Sep 23, 2009 5:48 pm

Hi Loic

No, we are not touching any of the ocr files.

I have created a test case application that reproduces this error and emailed it directly to your esupport address. I have run it on 3 different PCs and all 3 get the same error.

Let me know if you need any further information.

Thank you

Nigel

Post by Loïc » Fri Sep 25, 2009 1:18 pm

Hi,

I think we have solved your issue. We are running extensive stress tests before validating the update...

Kind regards,

Loïc

Post by Loïc » Fri Sep 25, 2009 4:15 pm

Hi Nigel,

Please update to the latest edition available. Your problem should be resolved.

With best regards,

Loïc

Post by Nigels » Tue Sep 29, 2009 6:25 pm

Hi Loic

Can you confirm that the new version has been posted on the web site?

I have uninstalled our old version and downloaded/installed the new version.

Now when I run it I do not get the dictionary error; instead, the process just terminates when it gets to the same place as before. There is no message, it just disappears. This happens when running the same test case both as an exe and from the IDE.

Our current version of gdpicturepro5s.ocx in Windows/System32 is 5.12.0.1.

Help!!

Nigel

Post by Loïc » Tue Sep 29, 2009 8:16 pm

I confirm.
Make sure you are using gdocrplug.tesseract.dll version 1.1.0.6.

I ran tests on 6 computers, all successfully.

Kind regards,

Loïc

Post by Loïc » Tue Sep 29, 2009 9:03 pm

Nigel,

Just a remark: you are not using the dictionary files provided by GdPicture:

eng.DangAmbigs
eng.freq-dawg
eng.inttemp
eng.normproto
eng.pffmtable
eng.unicharset
eng.user-words
eng.word-dawg

I don't know where you got them, but with our files there is no crash with the latest release. Therefore, I suggest you use our files.
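A quick sanity check before running OCR is to confirm that the full set of language files listed above is actually present in the OCR folder. Here is a minimal sketch; the suffix list is taken from the file names in this post, while `missing_language_files` and its parameters are hypothetical.

```python
from pathlib import Path

# Suffixes of the eight English language files listed in this thread.
LANGUAGE_FILE_SUFFIXES = (
    "DangAmbigs",
    "freq-dawg",
    "inttemp",
    "normproto",
    "pffmtable",
    "unicharset",
    "user-words",
    "word-dawg",
)


def missing_language_files(folder, lang="eng"):
    """Return the names of expected language files not found in `folder`."""
    folder = Path(folder)
    expected = [f"{lang}.{suffix}" for suffix in LANGUAGE_FILE_SUFFIXES]
    return [name for name in expected if not (folder / name).is_file()]
```

An empty list only means the files exist; it does not prove that their contents match the ones shipped with GdPicture.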

Kind regards,

Loïc

Post by Nigels » Thu Oct 01, 2009 11:44 am

Hi Loic

Yes, that seems to have fixed it.

From reading the documentation on the Google site, I was under the impression that the eng.user-words file could simply be replaced to add additional words to the dictionary. The one supplied with GdPicture does have quite a few "suspect" words in it, e.g. misspelt words ("Decibles") and words that do not really exist in the English language ("Dicators"). This is why I replaced this file with a "more recognised" list.

Anyway, the good news is that it works now.

So again, thank you.

Nigel

Post by Loïc » Thu Oct 01, 2009 11:56 am

Perfect Nigel !

Thank you for the feedback.

You are right about the dictionary modification. But I suspect a bad manipulation on your side, because many dict files had been modified.
If adding new words is really necessary for you, I suggest adding them one by one and running some tests to validate each change.
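The "one by one" approach can be automated along these lines. This is a rough sketch under two assumptions that are not confirmed in this thread: that the user-words file is plain text with one word per line, and that it can be read and written as UTF-8. `add_words_one_by_one` and the `validate` callback (for instance, a function that runs your OCR test case and returns True on success) are hypothetical names.

```python
from pathlib import Path


def add_words_one_by_one(user_words_path, new_words, validate):
    """Append words one at a time, keeping only those that pass validation.

    Assumes a plain-text file with one word per line (an assumption).
    `validate()` is called after each addition; if it returns False the file
    is restored to its previous content. Returns (accepted, rejected).
    """
    path = Path(user_words_path)
    accepted, rejected = [], []
    for word in new_words:
        before = path.read_text(encoding="utf-8")
        if word in before.splitlines():
            continue  # already present, nothing to test
        separator = "" if before == "" or before.endswith("\n") else "\n"
        path.write_text(before + separator + word + "\n", encoding="utf-8")
        if validate():
            accepted.append(word)
        else:
            path.write_text(before, encoding="utf-8")  # roll back this word
            rejected.append(word)
    return accepted, rejected
```

Working on a copy of the original file is advisable, so that the version shipped with GdPicture stays available.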

With best regards,

Loïc

Post by Nigels » Thu Oct 01, 2009 3:02 pm

Hi Loic

Loïc wrote:
    But I suspect a bad manipulation on your side, because many dict files had been modified.

Not really important because it is all working now, but I have just diffed the OCR files I sent with the original test case against the ones in the latest GdPicture update, and they are all identical with the exception of the "eng.user-words" file.

The files I checked are:
eng.DangAmbigs
eng.freq-dawg
eng.inttemp
eng.normproto
eng.pffmtable
eng.user-words
eng.word-dawg
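For anyone wanting to repeat this comparison, a byte-for-byte check of two OCR folders can be sketched as below. `differing_files` is a hypothetical helper; it compares only regular files directly inside each folder and reports sizes for files that differ.

```python
from pathlib import Path


def differing_files(dir_a, dir_b):
    """Compare the files in two folders byte for byte.

    Returns a dict mapping file name to a short status string for every file
    that is missing on one side or whose content differs. Identical files are
    omitted, so an empty dict means the folders match.
    """
    a, b = Path(dir_a), Path(dir_b)
    names_a = {p.name for p in a.iterdir() if p.is_file()}
    names_b = {p.name for p in b.iterdir() if p.is_file()}
    report = {}
    for name in sorted(names_a | names_b):
        if name not in names_b:
            report[name] = "only in first"
        elif name not in names_a:
            report[name] = "only in second"
        else:
            data_a = (a / name).read_bytes()
            data_b = (b / name).read_bytes()
            if data_a != data_b:
                report[name] = f"different ({len(data_a)} vs {len(data_b)} bytes)"
    return report
```

A file that appears in only one folder shows up in the report too, which is easy to miss with a manual diff.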

Could it be something to do with the size of the "eng.user-words" file?

As I said, this is not really important now because it is working with the original file.

Thanks again

Nigel
