Dictionary causing crash

Discussions about machine vision support in GdPicture.
heard
Posts: 78
Joined: Wed Jan 02, 2008 11:55 pm

Dictionary causing crash

Post by heard » Fri Feb 05, 2010 11:46 pm

Hi Loic,
I have a client that uses an OCR process for hundreds of thousands of pages of scanned documents. The process will sometimes run for days and then suddenly crash. When I start it again, it might OCR another 6000 pages and then crash again. When it crashes, it kills the processes on the workstation in such a way that my application simply disappears.

I have been tracing this issue for quite some time and have narrowed it down to the eng.user-words file. I have finally been able to reproduce the error consistently at my client's site. I have attached a Process Monitor trace file in case it is of any help. Each time the crash occurs, I get the same entries in the trace file. Look at line 20406.

If I use an empty eng.user-words file, it doesn't crash.

Can you tell me what this file actually does? Do I need it?

Any help is greatly appreciated.

Regards,
Heard
Attachments
Logfile.zip
(155.58 KiB)

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Dictionary causing crash

Post by Loïc » Sat Feb 06, 2010 1:11 pm

Hi Heard,

It is hard to help you with this issue.

Are you using the dictionary files provided by GdPicture?

To help you further, I need to be able to reproduce your error. For that I need the code you are using, some information about the system configuration, and the document being processed.

With best regards,

Loïc

heard
Posts: 78
Joined: Wed Jan 02, 2008 11:55 pm

Re: Dictionary causing crash

Post by heard » Mon Feb 08, 2010 4:38 pm

Loic,
Yes, I am using the standard dictionaries. Can you tell me what the user-words file is used for?

Thanks,
Heard

heard
Posts: 78
Joined: Wed Jan 02, 2008 11:55 pm

Re: Dictionary causing crash

Post by heard » Wed Feb 10, 2010 10:20 pm

Hi Loic,
I'll post this in hopes that it might help someone else.

I cannot get this to fail consistently at my office, but it fails every time at two of my clients' sites while processing the same page. I have tried to duplicate their environment as closely as possible, but I don't have the same processor, video card, etc. as they do. Since I cannot reproduce it here, I cannot send you anything that I know will produce errors for you.

I don't know what the eng.user-words file does, but while researching Tesseract I learned that this file is usually blank. Creating a blank eng.user-words file solves my problem at both clients. I have checked and am quite sure I was using the file provided by you when the crashes occurred. Also, this does not seem to have affected the OCR results, but if anyone knows otherwise, I would like to hear about it.
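
For anyone who wants to try the same workaround, here is a minimal sketch in Python of the step described above: keep a copy of the original eng.user-words file and put an empty one in its place. The function name `blank_user_words`, the `.bak` suffix, and the folder path in the usage line are assumptions made for illustration; they are not part of GdPicture or Tesseract, and the folder has to be whichever one your application actually loads its dictionaries from.

```python
import shutil
from pathlib import Path


def blank_user_words(dict_dir, lang="eng"):
    """Replace <lang>.user-words in dict_dir with an empty file.

    The original file is copied to <lang>.user-words.bak first, unless a
    backup already exists (so a second run never overwrites the real
    original with an empty file). Returns the backup path, or None if
    there was nothing to back up.
    """
    words_file = Path(dict_dir) / f"{lang}.user-words"
    backup = words_file.with_name(words_file.name + ".bak")

    # Hypothetical safety step: preserve the shipped file once.
    if words_file.exists() and not backup.exists():
        shutil.copy2(words_file, backup)

    # Truncate (or create) the user-words file so it is empty.
    words_file.write_bytes(b"")

    return backup if backup.exists() else None
```

Usage would look like `blank_user_words(r"C:\MyApp\dictionaries")`, where the path is only a placeholder. Restoring the original is a matter of copying the `.bak` file back over eng.user-words.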

Thanks,
Heard

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Dictionary causing crash

Post by Loïc » Fri Feb 12, 2010 4:10 pm

Hi Heard,

You are right, this file is usually blank. I will investigate whether we should empty it for our next release.

Cheers,

Loïc
