Generate searchable PDF from scanner, images or existing PDF

Posted: Mon Jan 26, 2009 3:34 pm
by dchillman
In the list of features for GdPicture.NET is "Generate searchable PDF from scanner, images or existing PDF files". I am particularly interested in the ability to open existing PDF documents and process them to create a searchable PDF document. After examining the online documentation, it wasn't obvious to me how this would be done. Do you have an example? Also, if it is possible, can the source be a stream rather than a file? Thanks, Dan

Re: Generate searchable PDF from scanner, images or existing

Posted: Mon Feb 02, 2009 6:25 pm
by Loïc
- Sample 1: Creating multipage searchable PDF (PDF/A 1.4) from the content of the document feeder of a scanner:


-- information has been moved to documentation and is now available in both VB.NET and C# --
see here: Creating multipage searchable PDF (PDF/A 1.4) from the content of the document feeder of a scanner



- Sample 2: Creating multipage searchable PDF (PDF/A 1.4) from a single or a multipage image document:


-- information has been moved to documentation and is now available in both VB.NET and C# --
see here: Multipage searchable PDF (PDF/A 1.4) from an image file (including multipage)



- Sample 3: Creating single page searchable PDF (PDF/A 1.4) from image:


-- information has been moved to documentation and is now available in both VB.NET and C# --
see sample #2


- Sample 4: Creating multipage searchable PDF (PDF/A 1.4) from existing multipage PDF:


-- information has been moved to documentation and is now available in both VB.NET and C# --
see here: Multipage searchable PDF from existing multipage PDF

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Wed May 27, 2009 5:37 pm
by jloizagah
Hi...

I'm trying to use the example for creating a multipage searchable PDF (PDF/A 1.4) from an existing multipage PDF, but when I copy the code, it seems that my oGdPictureImaging object does not have the methods PdfOCRStart, PdfAddGdPictureImageToPdfOCR and PdfOCRStop. Is there a version problem or something like that?

Best regards...

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Jun 30, 2009 11:47 am
by Loïc
Hi,

You need to download the latest edition: https://www.gdpicture.com/download/downl ... urenet.php

Kind regards,

Loïc

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Dec 01, 2009 5:44 pm
by mirkop
Hi Loic,

Could you post a sample for creating a multipage searchable PDF from an existing (multipage) PDF without using the GdViewer object? I need to create the file without a preview.

Mirko

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Dec 01, 2009 5:53 pm
by Loïc
Hi Mirko,

The code I gave doesn't generate any preview. You can use it within a simple function in a formless application.


kind regards,

Loïc

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Feb 23, 2010 2:05 pm
by mirkop
Hi,

I'm using your sample code for creating a searchable PDF from an existing PDF and it works.
I downloaded the latest version of GdPicture.

My application monitors a folder and converts the PDF files in it to searchable PDFs. The folder contains both searchable and non-searchable PDFs.
However, the new PDF is often larger than the original PDF. This happens when the original file is already a searchable PDF.

I have sent the PDF file to esupport@gdpicture.com.

Mirko

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Feb 23, 2010 2:51 pm
by Loïc
Hi Mirko,

This is normal behavior. With this method, an entirely new PDF is created from raster bitmaps. If the pages of the input PDF are composed only of bitmaps, the resulting file may be slightly larger or smaller than the original.
However, if the input document is composed of vector objects (shapes & text), the resulting PDF will in many cases be larger. This kind of PDF usually doesn't need to be processed because the text is already embedded in the document.

What I suggest is to try to extract the text of the original PDF using the GdViewer object. If the PDF contains text, you should not process it. If there is no text inside, you can perform OCR.
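
For illustration only, here is a minimal sketch of that decision in Python rather than in GdPicture.NET code. The names `has_embedded_text`, `select_for_ocr` and `extract_page_texts` are hypothetical: `extract_page_texts` stands in for whatever call returns the text of each page (for example a loop over PdfGetPageText()), whose exact signature is not shown in this thread.

```python
from typing import Callable, Iterable, List


def has_embedded_text(page_texts: Iterable[str], min_chars: int = 1) -> bool:
    """Return True if at least one page holds min_chars or more visible characters."""
    # Whitespace is dropped so a page containing only blanks or line breaks
    # is treated as having no text.
    return any(len("".join(text.split())) >= min_chars for text in page_texts)


def select_for_ocr(
    paths: Iterable[str],
    extract_page_texts: Callable[[str], Iterable[str]],
) -> List[str]:
    """Keep only the documents that carry no embedded text and so need OCR."""
    # extract_page_texts is a placeholder (assumption) for the real extraction call.
    return [p for p in paths if not has_embedded_text(extract_page_texts(p))]
```

A folder-monitoring application could run `select_for_ocr` over the incoming files and pass only the returned paths to the OCR step, leaving PDFs that are already searchable untouched. Raising `min_chars` is one way to ignore pages that carry only a stray character or two.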

Let me know if I am not clear enough.

With best regards,

Loïc Carrère

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Feb 23, 2010 3:22 pm
by mirkop
Hi,

It's clear. I'll use your suggestion.

How can I extract the text from the PDF? Using PdfGetPageText()?

Mirko

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Feb 23, 2010 3:23 pm
by Loïc
How can I extract the text from the PDF? Using PdfGetPageText()?
Yes, it is the fastest way.

Kind regards,

Loïc

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Tue Feb 23, 2010 3:32 pm
by mirkop
Hi Loic,

Thank you for your reply .. it works fine.

Re: Generate searchable PDF from scanner, images or existing

Posted: Mon May 12, 2014 12:40 pm
by Cedric
The code snippets have been moved and updated; they are now available in both C# and VB.NET in our documentation, as you can see here.

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Wed Nov 22, 2017 11:55 am
by Gabriela
Here are updated code snippets demonstrating how to create/convert documents based on GdPicture.NET 14:

Creating a searchable PDF document from an existing scanned PDF document
https://www.gdpicture.com/guides/gdpict ... ment.html

Creating a searchable PDF (PDF/A) document from an image file (both single and multi-page TIFF image)
https://www.gdpicture.com/guides/gdpict ... age).html

Creating a searchable PDF (PDF/A) document from the content of the document feeder of a scanner
https://www.gdpicture.com/guides/gdpict ... nner.html

Converting a TIFF image to a searchable PDF document using multithreading
https://www.gdpicture.com/guides/gdpict ... ding.html

Re: Generate searchable PDF from scanner, images or existing PDF

Posted: Sun Apr 17, 2022 9:16 pm
by Salardar
Thank you very much for the material.