OCR Individual Pages
-
- Posts: 32
- Joined: Sun Jan 30, 2011 8:40 pm
Hello,
I need to OCR individual pages of TIFF files, but I can't figure out an easy way to do that. Basically, all I am doing is looping through each page of a document, OCR'ing the page, and then storing the text in a database. I need to go page by page in order to show progress.
The problem is that the OCRTesseractDoOCR method OCRs an entire GdPicture image, so it appears that I could use it if I could load individual pages of a document into a GdPicture image object. I can't figure out how to do that, though. By the way, the images do not need to be displayed. This will all be done behind the scenes, apart from the progress information.
Thanks,
Reagan
Re: OCR Individual Pages
Hello Reagan,
Do you mean you want to OCR a multipage TIFF image?
Regards,
Loïc
Re: OCR Individual Pages
Yes. But I would like to be able to do one page at a time so that I can show progress for it.
Thanks,
Reagan
Re: OCR Individual Pages
Hello,
OK, it's easy to do:
1- Open the image
2- Select the desired page by using the TiffSelectPage() method
3- Run the OCR process
Repeat steps 2-3 for each page of your file.
Let me know if I am not clear enough.
Kind regards,
Loïc
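The three steps above can be sketched as a small page loop. This is only an illustration, not GdPicture's own code: the class name PageByPageOcr and the three callbacks (selectPage, ocrCurrentPage, reportProgress) are hypothetical names introduced here so the sketch compiles without the library.

```csharp
using System;
using System.Collections.Generic;

public static class PageByPageOcr
{
    // Runs OCR one page at a time and reports progress after each page.
    // selectPage, ocrCurrentPage and reportProgress are hypothetical hooks
    // supplied by the caller; they are not part of any library.
    public static List<string> Run(
        int pageCount,
        Action<int> selectPage,
        Func<string> ocrCurrentPage,
        Action<int, int> reportProgress)
    {
        var pageTexts = new List<string>();

        // Assumption: pages are numbered from 1.
        for (int page = 1; page <= pageCount; page++)
        {
            selectPage(page);                 // step 2: make this page the current one
            pageTexts.Add(ocrCurrentPage());  // step 3: OCR only the selected page
            reportProgress(page, pageCount);  // e.g. update a progress bar
        }

        return pageTexts;
    }
}
```

With GdPicture, selectPage would presumably wrap TiffSelectPage() and ocrCurrentPage would wrap OCRTesseractDoOCR() on the already opened image (step 1). Keeping one string per page, rather than one concatenated string, makes it straightforward to store each page's text in a database as it completes.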
Re: OCR Individual Pages
Thanks. That worked.
Reagan
Re: OCR Individual Pages
Hi Loïc,
I read your hint about running OCR on a multipage TIFF file, but I'm running into a problem. Let me explain.
I'm using the sample C# project installed in GdViewerSamplesv8\OCR\ with some changes.
I open a multipage TIFF, then loop over the pages and call OCR on each page.
This is the code:
Code:
// opens the file
int m_ImageID = oGdPictureImaging.CreateGdPictureImageFromFile(fileName);
string sOCR = string.Empty;
// loop pages
if (oGdPictureImaging.TiffIsMultiPage(m_ImageID))
{
    int pageCount = oGdPictureImaging.TiffGetPageCount(m_ImageID);
    for (int i = 1; i <= pageCount; i++)
    {
        if (i > 1)
            oGdPictureImaging.TiffSelectPage(m_ImageID, i);
        oGdPictureImaging.Scale(m_ImageID, 300, System.Drawing.Drawing2D.InterpolationMode.HighQualityBicubic);
        oGdPictureImaging.OCRTesseractReinit();
        sOCR += oGdPictureImaging.OCRTesseractDoOCR(m_ImageID, txtLang.Text, TextBox1.Text, string.Empty);
        oGdPictureImaging.OCRTesseractClear();
    }
}
At the end of the procedure, my string sOCR contains the text of the first page repeated three times (my TIFF file has three pages).
I tried to use the property TiffOpenMultiPageForWrite, but nothing changed.
The only way to get the intended result is to use
Code:
m_ImageID = oGdPictureImaging.TiffCreateMultiPageFromFile(fileName);
instead of
Code:
m_ImageID = oGdPictureImaging.CreateGdPictureImageFromFile(fileName);
The problem with this method is that sometimes I don't have a filename, only a stream, so I use the method gdPicture.CreateGdPictureImageFromStream(binaryContent).
I'm probably doing something wrong.
Can you help me?
Thank you in advance.
Michela
P.S. I'm using GdPicture v. 8.3.
Re: OCR Individual Pages
Hello,
First, I suggest you upgrade to the latest 8.X edition. To get the download link, please create a ticket here: https://www.gdpicture.com/support/getting-support-from-our-team
Also, have you tried replacing CreateGdPictureImageFromFile() with the TiffCreateMultiPageFromFile() method?
Kind regards,
Loïc
Re: OCR Individual Pages
Yes, using the method TiffCreateMultiPageFromFile() I get the expected behaviour.
Thanks a lot.
Michela