Generate searchable PDF from scanner, images or existing PDF

Example requests & Code samples for GdPicture Toolkits.
dchillman
Posts: 2
Joined: Wed Jan 21, 2009 9:03 pm

Generate searchable PDF from scanner, images or existing PDF

Post by dchillman » Mon Jan 26, 2009 3:34 pm

The list of features for GdPicture.NET includes "Generate searchable PDF from scanner, images or existing PDF files". I am particularly interested in the ability to open an existing PDF document and process it to create a searchable PDF document. After examining the online documentation, it wasn't obvious to me how this would be done. Do you have an example? Also, if it is possible, can the source be a stream rather than a file? Thanks, Dan

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Generate searchable PDF from scanner, images or existing

Post by Loïc » Mon Feb 02, 2009 6:25 pm

- Sample 1: Creating multipage searchable PDF (PDF/A 1.4) from the content of the document feeder of a scanner:

-- information has been moved to documentation and is now available in both VB.NET and C# --
see here: Creating multipage searchable PDF (PDF/A 1.4) from the content of the document feeder of a scanner



- Sample 2: Creating multipage searchable PDF (PDF/A 1.4) from a single or a multipage image document:

-- information has been moved to documentation and is now available in both VB.NET and C# --
see here: Multipage searchable PDF (PDF/A 1.4) from an image file (including multipage)



- Sample 3: Creating single page searchable PDF (PDF/A 1.4) from image:

-- information has been moved to documentation and is now available in both VB.NET and C# --
see sample #2


- Sample 4: Creating multipage searchable PDF (PDF/A 1.4) from existing multipage PDF:

-- information has been moved to documentation and is now available in both VB.NET and C# --
see here: Multipage searchable PDF from existing multipage PDF

jloizagah
Posts: 29
Joined: Tue Mar 17, 2009 2:45 pm

Re: Generate searchable PDF from scanner, images or existing PDF

Post by jloizagah » Wed May 27, 2009 5:37 pm

Hi...

I'm trying to use the example "Creating multipage searchable PDF (PDF/A 1.4) from existing multipage PDF", but when I copy the code, it seems that my oGdPictureImaging object does not have the methods PdfOCRStart, PdfAddGdPictureImageToPdfOCR and PdfOCRStop. Is there a version problem or something like that?

Best regards...

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Generate searchable PDF from scanner, images or existing PDF

Post by Loïc » Tue Jun 30, 2009 11:47 am

Hi,

You need to download the latest edition: https://www.gdpicture.com/download/downl ... urenet.php

Kind regards,

Loïc

mirkop
Posts: 41
Joined: Wed Jun 24, 2009 5:38 pm

Re: Generate searchable PDF from scanner, images or existing PDF

Post by mirkop » Tue Dec 01, 2009 5:44 pm

Hi Loic,

Could you post a sample for creating a multipage searchable PDF from an existing (multipage) PDF without using the GdViewer object? I need to create the file without a preview.

Mirko

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Generate searchable PDF from scanner, images or existing PDF

Post by Loïc » Tue Dec 01, 2009 5:53 pm

Hi Mirko,

The code I gave doesn't generate any preview. You can use it within a simple function in a formless application.

Kind regards,

Loïc

mirkop
Posts: 41
Joined: Wed Jun 24, 2009 5:38 pm

Re: Generate searchable PDF from scanner, images or existing PDF

Post by mirkop » Tue Feb 23, 2010 2:05 pm

Hi,

I'm using your sample code for creating a searchable PDF from an existing PDF, and it works.
I downloaded the latest version of GdPicture.

My application monitors a folder and converts the PDF files in it to searchable PDFs. The folder contains both searchable and non-searchable PDFs.
However, the new PDF is often larger than the original. This happens when the original file is already a searchable PDF.

I am sending the PDF file to esupport@gdpicture.com.

Mirko

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Generate searchable PDF from scanner, images or existing PDF

Post by Loïc » Tue Feb 23, 2010 2:51 pm

Hi Mirko,

This is normal behavior. With this method, an entirely new PDF is created from raster bitmaps. If the pages of the input PDF are composed only of bitmaps, the resulting file may be slightly larger or smaller than the original.
However, if the input document is composed of vector objects (shapes and text), the resulting PDF will in many cases be larger. This kind of PDF usually doesn't need to be processed, because the text is already embedded in the document.

What I suggest is to try extracting the text of the original PDF using the GdViewer object. If the PDF contains text, you should not process it. If there is no text inside, you can perform OCR.
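
As a rough, language-neutral illustration of that check, here is a minimal Python sketch. It is not GdPicture code: `extract_page_texts` and `run_ocr` are hypothetical stand-ins for whatever calls your application uses to read the text of each page and to run the OCR conversion, and the `min_chars` threshold is an assumption you may want to tune.

```python
from typing import Callable, Iterable


def has_embedded_text(page_texts: Iterable[str], min_chars: int = 1) -> bool:
    """Return True once the pages hold at least min_chars non-whitespace characters."""
    count = 0
    for text in page_texts:
        # Ignore spaces and line breaks so that blank pages do not count as text.
        count += len("".join(text.split()))
        if count >= min_chars:
            return True
    return False


def make_searchable(
    path: str,
    extract_page_texts: Callable[[str], Iterable[str]],  # hypothetical text extractor
    run_ocr: Callable[[str], None],  # hypothetical OCR conversion
) -> str:
    """Run OCR only when the PDF has no text layer; report what was done."""
    if has_embedded_text(extract_page_texts(path)):
        # Already searchable: re-rasterizing would only inflate the file.
        return "skipped"
    run_ocr(path)
    return "ocr"
```

Passing the two operations in as functions keeps the decision logic independent of the toolkit, so the same check can be reused for every file picked up from a monitored folder.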

Let me know if I am not clear enough.

With best regards,

Loïc Carrère

mirkop
Posts: 41
Joined: Wed Jun 24, 2009 5:38 pm

Re: Generate searchable PDF from scanner, images or existing PDF

Post by mirkop » Tue Feb 23, 2010 3:22 pm

Hi,

It's clear. I'll use your suggestion.

How can I extract the text from the PDF? Using PdfGetPageText()?

Mirko

Loïc
Site Admin
Posts: 5881
Joined: Tue Oct 17, 2006 10:48 pm
Location: France

Re: Generate searchable PDF from scanner, images or existing PDF

Post by Loïc » Tue Feb 23, 2010 3:23 pm

"How can I extract the text from the PDF? Using PdfGetPageText()?"
Yes, it is the fastest way.

Kind regards,

Loïc

mirkop
Posts: 41
Joined: Wed Jun 24, 2009 5:38 pm

Re: Generate searchable PDF from scanner, images or existing PDF

Post by mirkop » Tue Feb 23, 2010 3:32 pm

Hi Loic,

Thank you for your reply. It works fine.

Cedric
Posts: 269
Joined: Sun Sep 02, 2012 7:30 pm

Re: Generate searchable PDF from scanner, images or existing

Post by Cedric » Mon May 12, 2014 12:40 pm

The code snippets have been moved and updated; they are now available in both C# and VB.NET in our documentation, as you can see here.

Gabriela
Posts: 436
Joined: Wed Nov 22, 2017 9:52 am

Re: Generate searchable PDF from scanner, images or existing PDF

Post by Gabriela » Wed Nov 22, 2017 11:55 am

Here are updated code snippets demonstrating how to create/convert documents based on GdPicture.NET 14:

Creating a searchable PDF document from an existing scanned PDF document
https://www.gdpicture.com/guides/gdpict ... ment.html

Creating a searchable PDF (PDF/A) document from an image file (both single and multi-page TIFF image)
https://www.gdpicture.com/guides/gdpict ... age).html

Creating a searchable PDF (PDF/A) document from the content of the document feeder of a scanner
https://www.gdpicture.com/guides/gdpict ... nner.html

Converting a TIFF image to a searchable PDF document using multithreading
https://www.gdpicture.com/guides/gdpict ... ding.html

Salardar
Posts: 2
Joined: Sun Apr 17, 2022 9:14 pm

Re: Generate searchable PDF from scanner, images or existing PDF

Post by Salardar » Sun Apr 17, 2022 9:16 pm

Thank you very much for the material.
