OCR on Images from Camera with Codes
Hi there,
We've been using the GdPicture Tesseract OCR engine for a few months. Now I have some problems with correct recognition.
We take a photo of a printed image, for example the attached "Full Picture (e_original.png)". We have this as a bitmap file. I'm able to find the "field" for the OCR rectangle, so I do not have to scan the whole image. I can use SetROI and reduce the image to the text only. You can see this in ocrtest.png (Prepared for OCR).
I have tried the following ways to get a correct recognition:
Way 1:
1. Take the original picture (e_original.png)
2. _gdPictureImaging.SetROI to the text field at the bottom of the image
3. _gdPictureImaging.OCRTesseractReinit()
4. _gdPictureImaging.OCRTesseractSetPassCount(2)
5. _gdPictureImaging.OCRTesseractDoOCR: tried with the German and English dictionaries, with and without whitelist chars. Please note: we know how the text is built, so we can say the first 3 chars are text, the next 3 are numbers, the next 1 is only the letter S or W, and so on...
Problem with this:
a) If I do not enter a whitelist and use the German dictionary, I get a result like: 35 l47XO4CE l (correct would be: 35147X04CE1)
b) If I do not enter a whitelist and use the English dictionary, I get a result like: EEWDKSO (correct would be: 35147X04CE1)
c) If I do enter a whitelist like "1234567890XYZABCDE", I get a completely wrong result, like in point b)
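Since the layout of the code is known in advance, result a) can in principle be repaired after OCR by using the expected character class at each position. The following is only a minimal sketch of that idea in Python, not part of GdPicture: the function name, the confusion tables, and the pattern string "DDDDDLDDLLD" (D = digit, L = letter) are assumptions, the pattern being inferred from the single example 35147X04CE1.

```python
# Hypothetical post-processing of an OCR result using a known layout.
# "D" marks a position that must be a digit, "L" one that must be a letter.

# Look-alike characters that OCR commonly confuses (illustrative, not exhaustive).
LETTER_TO_DIGIT = {"l": "1", "I": "1", "O": "0", "o": "0", "S": "5", "B": "8", "Z": "2"}
DIGIT_TO_LETTER = {"1": "I", "0": "O", "5": "S", "8": "B", "2": "Z"}


def fix_by_pattern(raw, pattern):
    """Return the corrected text, or None if it cannot match the pattern."""
    text = "".join(raw.split())  # drop the stray spaces the OCR inserted
    if len(text) != len(pattern):
        return None  # wrong character count: positions cannot be trusted
    fixed = []
    for ch, kind in zip(text, pattern):
        if kind == "D" and not ch.isdigit():
            ch = LETTER_TO_DIGIT.get(ch, ch)
        elif kind == "L" and not ch.isalpha():
            ch = DIGIT_TO_LETTER.get(ch, ch)
        # Reject the result if the character still has the wrong class.
        if (kind == "D" and not ch.isdigit()) or (kind == "L" and not ch.isalpha()):
            return None
        fixed.append(ch)
    return "".join(fixed)


result = fix_by_pattern("35 l47XO4CE l", "DDDDDLDDLLD")
print(result)
```

A result of None signals that the text should be re-scanned rather than guessed at, which matters for codes where a silent wrong character is worse than a retry.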
Way 2:
1. Take the original picture (e_original.png)
2. _gdPictureImaging.SetROI to the text field at the bottom of the image
3. _gdPictureImaging.OCRTesseractReinit()
4. _gdPictureImaging.OCRTesseractSetPassCount(2)
5. _gdPictureImaging.OCRTesseractDoOCR and count chars
6. Do OCRTesseractDoOCR again for each group of letters. From the result of point 5 I can see where I have to set the ROI for the first 5 letters. I then set the ROI for these 5 letters and do the OCR again with the whitelist "1234567890". I then get a very good recognition result!
Problem with this:
a) Sometimes the OCR does not recognize the correct number of letters, and then the grouping doesn't really work
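One possible way around the miscount, sketched here under assumptions, is to derive the group rectangles from the known layout instead of from the first OCR pass. The Python below assumes the characters are roughly evenly spaced across the text field and that the group sizes (5, 1, 2, 2, 1 for 35147 X 04 CE 1) are known; the function name and the (left, top, width, height) tuple are hypothetical and not GdPicture calls.

```python
# Hypothetical helper: split one text-field rectangle into per-group rectangles,
# assuming evenly spaced characters and a known number of characters per group.

def group_rois(roi, group_sizes):
    """roi is (left, top, width, height); returns one rectangle per group."""
    left, top, width, height = roi
    total_chars = sum(group_sizes)
    pitch = width / total_chars  # estimated width of a single character
    rois = []
    start = 0  # index of the first character of the current group
    for size in group_sizes:
        x0 = left + round(start * pitch)
        x1 = left + round((start + size) * pitch)
        rois.append((x0, top, x1 - x0, height))
        start += size
    return rois


# Example: a 220-pixel-wide field holding 11 characters in groups 5/1/2/2/1.
for rect in group_rois((100, 400, 220, 30), [5, 1, 2, 2, 1]):
    print(rect)
```

Each rectangle could then be used as the ROI for one OCR call with the whitelist that fits that group. With proportional fonts or a skewed photo the even-spacing assumption breaks down, so a small horizontal margin around each rectangle may be needed.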
Way 3:
1. Take the original picture (e_original.png)
2. _gdPictureImaging.SetROI to the text field at the bottom of the image
3. _gdPictureImaging.OCRTesseractReinit()
4. _gdPictureImaging.OCRTesseractSetPassCount(2)
5. _gdPictureImaging.ConvertTo1Bpp(index, 150);
6. _gdPictureImaging.Crop to the text field size
7. _gdPictureImaging.Resize the picture to 25% of the original
8. _gdPictureImaging.ResetROI()
9. _gdPictureImaging.OCRTesseractDoOCR (tried with and without whitelist)
Problem with this:
-> the same as in way 1
Now I don't know whether I am doing something wrong. Maybe someone can give me some hints about what I have to do?
Best regards
Marco
Re: OCR on Images from Camera with Codes
Hi,
I have been able to get the correct result with some image processing:
ConvertTo1Bpp(ImageID, 100) 'Threshold
FxBitonalDilate4(ImageID) 'Dilate characters
SetROI
OCR
Hope this helps !
Loïc
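For readers unfamiliar with the two preprocessing steps above, here is a minimal pure-Python sketch of what thresholding to 1 bit per pixel and a 4-neighbour dilation do to an image. It illustrates the concept only and is not GdPicture's implementation: the function names are hypothetical, and it assumes that pixels darker than the cutoff become black (1) and that the dilation grows black pixels up, down, left and right.

```python
# Conceptual sketch: threshold a grayscale image, then thicken the dark strokes.
# Images are lists of rows; grayscale values run from 0 (black) to 255 (white).

def threshold(gray, cutoff):
    """Return a bitonal image: 1 (black) where the pixel is darker than cutoff."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]


def dilate4(bitmap):
    """Set every pixel that has a black neighbour above, below, left or right."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    out = [row[:] for row in bitmap]  # copy so reads are not affected by writes
    for y in range(height):
        for x in range(width):
            if bitmap[y][x]:
                continue  # already black
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width and bitmap[ny][nx]:
                    out[y][x] = 1
                    break
    return out


gray = [
    [255, 255, 255],
    [255, 40, 255],
    [255, 255, 255],
]
for row in dilate4(threshold(gray, 100)):
    print(row)
```

Thickening thin or broken strokes in this way is a plausible reason the characters in a camera photo become easier to recognize, although the effect depends on the font and the image resolution.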
Re: OCR on Images from Camera with Codes
Great! Absolutely perfect! It works now.
Thank you very much.
Re: OCR on Images from Camera with Codes
Hi,
I am glad this is working fine. Thank you for the feedback.
Cheers,
Loïc