
[Demo App] Multi-thread Tiff to PDF/OCR

Posted: Fri Nov 11, 2011 12:20 pm
by Loïc
-Edit-

Here is a new version (vb.net & c#) based on built-in multitasking support of GdPicture 11.
Attachment: TIFF to PDF-OCR.zip (340.24 KiB)
Hi there,

Following many customer requests, we are providing a VB.NET demo application that converts a multipage TIFF document to PDF/OCR using a predefined number of threads.

The app was created with Visual Studio 2010 (VB.NET).

Application behavior:
- Expects the user to provide a multipage TIFF to convert to PDF/OCR, a valid dictionary path and a language (default is English).
- Splits the input TIFF document into single-page TIFFs (1 file = 1 page).
- Performs OCR in multi-thread mode (1 page = 1 thread) and creates 1 PDF per page.
- When OCR is done, the app merges the produced PDFs into a single PDF.
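
The steps above can be sketched roughly as follows. This is not the demo's actual code: it uses the .NET 4 Parallel.For helper instead of the demo's own threading, and OcrPageToPdf, the 12-page input and the limit of 4 concurrent workers are hypothetical placeholders. The GdPicture split, OCR and merge calls are deliberately left out and only marked by comments.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Threading.Tasks

Module PipelineSketch
    ' Hypothetical placeholder: the real app would run OCR on the page here
    ' and return the path of the PDF written for it.
    Function OcrPageToPdf(pageTiff As String) As String
        Return Path.ChangeExtension(pageTiff, ".pdf")
    End Function

    Sub Main()
        ' Step 1 (placeholder): pretend the input TIFF was split into 12 single-page files.
        Dim pages As New List(Of String)
        For i As Integer = 1 To 12
            pages.Add("page" & i.ToString("0000") & ".tif")
        Next

        ' Step 2: process the pages in parallel, capped at 4 concurrent workers (assumed value).
        ' Each iteration writes to its own array slot, so no locking is needed here.
        Dim pdfs(pages.Count - 1) As String
        Dim options As New ParallelOptions With {.MaxDegreeOfParallelism = 4}
        Parallel.For(0, pages.Count, options,
                     Sub(i As Integer)
                         pdfs(i) = OcrPageToPdf(pages(i))
                     End Sub)

        ' Step 3 (placeholder): the real app would merge these PDFs, in page order, into one file.
        Console.WriteLine(String.Join(Environment.NewLine, pdfs))
    End Sub
End Module
```

Storing each result at the index of its page keeps the merge order independent of the order in which the workers finish.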

Prerequisites:
- Visual Studio 2010 or higher.
- Install GdPicture.NET 8.4.3 or higher.
- Open the app and replace "XXX" with a valid trial or commercial key.

[Image: mtocr.png] TIFF to PDF/OCR multi-thread application screenshot.

Feel free to post any question or comment.

Kind regards,

Loïc

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Fri Nov 11, 2011 6:40 pm
by JacobRusso
Hi Loic,

Thank you very much. This looks fantastic. Just one question. I am a bit confused with the license numbers. In your example, you use "oLicenseManager.RegisterKEY"... but I received TWO keys, one for GdPicture Image and another for the Tesseract Add-On. What is the correct way to set my license?

Thanks,
Jacob

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Fri Nov 11, 2011 6:53 pm
by Loïc
Hi Jacob,

Just call RegisterKey once for each of your licenses. The order does not matter.

For example:

oLicenseManager.RegisterKey(LIC1)
oLicenseManager.RegisterKey(LIC2)

Cheers!

Loïc

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Fri Nov 11, 2011 8:01 pm
by JacobRusso
Thank you again, that works!
I did not purchase the PDF Add-On. When it tries the "oGdPicturePDF.MergeDocuments(files, fileDest)" I get a message that I'm not licensed for the PDF Add-On. What would be the cleanest way to do this without the Add-On?

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Sun Nov 13, 2011 5:52 pm
by Loïc
Hi Jacob,

Unfortunately, there is no way to get this sample working without the GdPicture PDF plugin. I am sorry, I forgot to mention that.

Kind regards,

Loïc

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Wed Feb 22, 2012 10:12 am
by rens012
Hi,

Is there a way to create a PDF/A file with the oGdPicturePDF.MergeDocuments() method?


Thanks,

Rens

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Wed Feb 22, 2012 3:57 pm
by Loïc
Hello,

In the next minor release, MergeDocuments() will generate PDF/A according to the input documents. I will soon upload a modified version of the demo that shows how to use it.

Kind regards,

Loïc

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Thu Feb 23, 2012 6:14 pm
by Loïc
Hello,

Please find attached the version that supports PDF/A as output. It requires GdPicture.NET 8.5.15 or higher.



Kind regards,

Loïc

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Thu Mar 08, 2012 1:33 am
by JacobRusso
Hi Loic,
When I extract and open this project, the "modGlobals.vb" file is missing.
Also, in the previous version, "MultiPageOCRThreading.zip", I found that the pages were not being processed in the proper order. The problem was in the "cmdRun_Click" event. When the individual pages are stored, the sort order of the files goes off track if there are more than 9 pages. For example:

page1.tif
page11.tif
page2.tif

I was able to correct this by adding a "Format" call to the "SaveAsTIFF" statement, as follows:

oGdPictureImaging.SaveAsTIFF(tiffID, tmp_path + "\page" + Format(i, "0000").ToString + ".tif", GdPicture.TiffCompression.TiffCompressionAUTO)

This way, they are sorted correctly as:
page0001.tif
page0002.tif
...
page0011.tif
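
The effect of the zero-padding on the sort order can be checked with a few lines of plain VB.NET, independent of GdPicture. This is only an illustration: the file names are generated in memory and an ordinal string comparison is assumed, which may differ from how a given file listing is sorted.

```vbnet
Imports System
Imports System.Linq

Module SortOrderDemo
    Sub Main()
        ' Unpadded names: string comparison puts "page10" and "page11" before "page2".
        Dim unpadded As String() = Enumerable.Range(1, 11).
            Select(Function(i) "page" & i.ToString() & ".tif").ToArray()
        Array.Sort(unpadded, StringComparer.Ordinal)
        Console.WriteLine(String.Join(", ", unpadded.Take(4).ToArray()))
        ' prints: page1.tif, page10.tif, page11.tif, page2.tif

        ' Zero-padded names: string order now matches page order.
        Dim padded As String() = Enumerable.Range(1, 11).
            Select(Function(i) "page" & i.ToString("0000") & ".tif").ToArray()
        Array.Sort(padded, StringComparer.Ordinal)
        Console.WriteLine(String.Join(", ", padded.Take(4).ToArray()))
        ' prints: page0001.tif, page0002.tif, page0003.tif, page0004.tif
    End Sub
End Module
```

A four-digit pad keeps the order correct up to 9999 pages.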

Thanks,
Jacob

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Thu Mar 08, 2012 5:22 am
by JacobRusso
Loïc,

One more thing. Do you have any experience with the new .NET Framework 4 "System.Threading.Tasks" namespace or TaskFactory class? It seems to be very powerful, and hopefully, easier to implement?

Imports System.Threading
Imports System.Threading.Tasks

' A lambda needs its (empty) parameter list: Sub()
Dim taskA = _
Task.Factory.StartNew(Sub()

' ... multithreaded statements ...

End Sub)
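
For reference, a complete version of this idea might look like the sketch below. It assumes .NET Framework 4 or later. ProcessPage is a hypothetical stand-in for the per-page OCR work and the five pages are made up, so nothing here calls GdPicture.

```vbnet
Imports System
Imports System.Threading.Tasks

Module TaskSketch
    ' Hypothetical stand-in for the per-page work; returns the name of the PDF it would produce.
    Function ProcessPage(pageNumber As Integer) As String
        Return "page" & pageNumber.ToString("0000") & ".pdf"
    End Function

    Sub Main()
        Dim tasks(4) As Task(Of String)   ' 5 tasks, one per page
        For i As Integer = 0 To tasks.Length - 1
            ' Copy the loop counter so that each lambda captures its own value.
            Dim pageNumber As Integer = i + 1
            tasks(i) = Task.Factory.StartNew(Of String)(Function() ProcessPage(pageNumber))
        Next

        ' Block until every task has finished, then read the results in page order.
        Task.WaitAll(tasks)
        For Each t As Task(Of String) In tasks
            Console.WriteLine(t.Result)
        Next
    End Sub
End Module
```

Because the results are read from the task array rather than in completion order, the output stays in page order even if the tasks finish out of sequence.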

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Thu Mar 08, 2012 3:33 pm
by Loïc
Hello,

Please find attached the fixed version.

Kind regards,

Loïc

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Tue Jun 26, 2012 5:33 pm
by rom
Hi Loïc


I'm using the trial version in a Delphi project.


Can you provide the same example as a Delphi project?

Thanks

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Thu Jul 05, 2012 4:56 pm
by rom
Hi, can you provide the same example using Delphi?

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Thu Jul 05, 2012 4:58 pm
by Loïc
Hello Rom,

Unfortunately, we have no expertise in multithreading with Delphi.

Re: [Demo App] Multi-thread Tiff to PDF/OCR

Posted: Fri Nov 02, 2012 4:38 am
by sulfaroj
Hi,

Are there any updates to this demo app? I am looking for a way to OCR a PDF, not a multipage TIFF, in a multithreaded or parallel way. With .NET 4.0, is there any plan to handle OCR of a multipage document (PDF) in parallel mode internally in the toolkit? Something like this would be a great feature.