How to get accurate recognitions on letter o and number 0?

Discussions about machine vision support in GdPicture.
Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Post by Loïc » Sun Nov 30, 2008 1:02 pm

Hi,

Could you send me this image at esupport (at) gdpicture (dot) com?

I'll see if I can do something.

Best regards,

Loïc

Post by Loïc » Mon Dec 08, 2008 2:22 pm

Hi Guepin,

I already sent you an answer by email. Maybe it is in your spam folder?

Here is a copy of my answer:

Hi,

Your text is written in a font that is not OCR-friendly.

However, you can get better results by converting the image to 1bpp before running OCR.

Also, you can call the FxDilate4 method before running OCR. It can repair some characters.

Best regards,

Loïc

Post by Loïc » Tue Dec 09, 2008 6:36 pm

Hi,

You have probably loaded the image into an Imaging object to perform OCR.

Before calling the OCR function, you can call the ConvertTo1Bpp() method.

Also, you can try to repair characters by calling FxDilate4() (FxBitonalDilate4() for .NET).

For example:

Using ActiveX editions:

Code:

Imaging1.SetNativeImage(GdViewer1.GetNativeImage())
Imaging1.ConvertTo1Bpp()
Imaging1.FxDilate4()

Using .NET editions:

Code:

oGdPictureImaging.ConvertTo1Bpp(m_ImageID)
oGdPictureImaging.FxBitonalDilate4(m_ImageID)

Then do OCR here.
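
To give an idea of what these two steps do to the pixels, here is a minimal sketch in plain Python. It is not GdPicture code: the function names `to_1bpp` and `dilate4`, the threshold of 128, and the reading of "Dilate4" as growth into the 4 direct neighbours are all assumptions made for illustration.

```python
# Illustrative sketch only: pure Python, not GdPicture code.
# Assumption: "1bpp" means each pixel becomes 0 (white) or 1 (black), and
# "Dilate4" grows black pixels into their 4 direct neighbours.

def to_1bpp(gray, threshold=128):
    """Binarize a grayscale image (rows of 0-255 values): dark pixels become 1."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def dilate4(bitmap):
    """Set a pixel to 1 if it or any of its up/down/left/right neighbours is 1."""
    height, width = len(bitmap), len(bitmap[0])
    out = [row[:] for row in bitmap]
    for y in range(height):
        for x in range(width):
            if bitmap[y][x]:
                continue  # already black, nothing to do
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width and bitmap[ny][nx]:
                    out[y][x] = 1
                    break
    return out


# A horizontal stroke with a one-pixel break in the middle (dark = low values).
gray = [
    [255, 255, 255, 255, 255],
    [ 20,  30, 200,  25,  10],
    [255, 255, 255, 255, 255],
]

binary = to_1bpp(gray)      # the break survives binarization
repaired = dilate4(binary)  # the break in the middle row is closed
```

In this toy case the broken stroke in the middle row becomes continuous after dilation. The same operation also thickens every stroke, so it can close small openings in other glyphs as a side effect.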

With best regards,

Loïc

Post by Loïc » Thu Jan 08, 2009 8:32 pm

Hi,

Sorry for the late reply; I thought this topic was closed. Next time, please start a new topic for a new question.

Quote:
It seems to have fixed the o versus 0 problem. However, it introduced another problem: the letter e was now incorrectly OCRed to c.

Any ideas to fix the new problem?

Unfortunately not. OCR is not an exact science. You can try to improve OCR accuracy by applying filters to your image, such as character repair, erosion, or dilation. See the "Bitonal Image enhancement function" section of the toolkit's reference guide for an exhaustive list of functions.
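
As a rough illustration of why such filters have to be tried per image, the sketch below applies a dilation and an erosion to a tiny ring-shaped glyph. It is plain Python, not the GdPicture API: `dilate4`, `erode4` and the 4-neighbour definition are assumptions for illustration, and the toolkit's own filters may behave differently.

```python
# Illustrative sketch only: pure Python, not the GdPicture API.
# Bitmaps are lists of rows; 1 = black (ink), 0 = white (background).

NEIGHBOURS4 = ((-1, 0), (1, 0), (0, -1), (0, 1))


def _neighbours(bitmap, y, x, outside):
    """Yield the 4 direct neighbours of (y, x); off-image pixels count as `outside`."""
    height, width = len(bitmap), len(bitmap[0])
    for dy, dx in NEIGHBOURS4:
        ny, nx = y + dy, x + dx
        if 0 <= ny < height and 0 <= nx < width:
            yield bitmap[ny][nx]
        else:
            yield outside


def dilate4(bitmap):
    """Thicken ink: a pixel becomes black if it or any 4-neighbour is black."""
    return [
        [1 if bitmap[y][x] or any(_neighbours(bitmap, y, x, 0)) else 0
         for x in range(len(bitmap[0]))]
        for y in range(len(bitmap))
    ]


def erode4(bitmap):
    """Thin ink: a pixel stays black only if it and all 4-neighbours are black."""
    return [
        [1 if bitmap[y][x] and all(_neighbours(bitmap, y, x, 0)) else 0
         for x in range(len(bitmap[0]))]
        for y in range(len(bitmap))
    ]


# A small ring, one pixel thick, with a one-pixel hole in the centre.
glyph = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

filled = dilate4(glyph)  # the centre hole gets filled in
erased = erode4(glyph)   # the thin ring disappears entirely
```

On this glyph, dilation fills the hole and erosion wipes out the one-pixel stroke. A filter that repairs one character can therefore damage another, so the right combination depends on the font and the scan.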

Quote:
Can I use GdPicture's PDF functions to search for the text after completing the OCR process? If possible, please show me how. The more detail, the better. I'd like to build an interface with VB where the user can search the text in the PDF file.

This feature is not implemented in the GdPicture ActiveX editions.
It is implemented for the .NET edition and should be released within 2 weeks.
We are looking for a way to bring this feature to the ActiveX editions, but I can't promise a release date today: we are encountering too many problems with the PDF engine we are using.

Best regards,

Loïc
