PDF + OCR not PDF/A

Posted: Tue Jul 06, 2010 1:20 pm
by rromeijn
How can I save the image as a plain PDF with OCR?
This sample code

Imaging1.CreateImageFromFile ("image.tif")
Imaging1.SaveAsPDFOCR("output.pdf", TesseractDictionaryEnglish, App.Path & "\AppData")  'AppData includes dictionary files
Imaging1.CloseNativeImage
saves the image as PDF/A + OCR.
I want plain PDF + OCR.

Re: PDF + OCR not PDF/A

Posted: Mon Aug 16, 2010 11:24 am
by rromeijn
A lot of views, but not one reply.

Re: PDF + OCR not PDF/A

Posted: Mon Aug 16, 2010 3:22 pm
by eagleman
@rromeijn

I do the following:

imageID = Imaging1.CreateGdPictureImageFromFile("00000001.JPG");
iPdfId = Imaging1.TwainPdfStart("00000001.PDF", true, "", "", "", "", "");
Imaging1.TwainAddGdPictureImageToPdf(iPdfId, imageID);
Imaging1.TwainPdfStop(iPdfId);
Imaging1.ReleaseGdPictureImage(imageID);


Good luck.

Eagleman

Re: PDF + OCR not PDF/A

Posted: Mon Aug 16, 2010 3:25 pm
by rromeijn
Thanks,

but that saves the image as a PDF without OCR.
I need PDF with OCR, but not PDF/A with OCR.

Re: PDF + OCR not PDF/A

Posted: Mon Aug 16, 2010 3:31 pm
by Loïc
Hi,

This option is not available.
Why is PDF/A a problem for you? PDF/A is certified 100% PDF compliant.

A workaround is to remove the PDF/A flag by replacing the header information "%âãÏÓ" with " " in the generated PDF. But in my humble opinion there is no sense in doing that...
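
The header replacement described above can be sketched as a small post-processing step on the generated file. This is only a sketch under assumptions: it treats "%âãÏÓ" as the five bytes `%` followed by 0xE2 0xE3 0xCF 0xD3, it overwrites them with the same number of spaces so that byte offsets elsewhere in the file do not shift, and the function names `blank_header_marker` and `patch_file` are hypothetical. Whether this is enough for a given viewer to stop treating the file as PDF/A is not confirmed in this thread.

```python
from pathlib import Path

# Assumption: the marker is '%' + 0xE2 0xE3 0xCF 0xD3, which shows up as
# "%âãÏÓ" when the PDF is opened as Latin-1 text.
MARKER = b"%\xe2\xe3\xcf\xd3"


def blank_header_marker(data: bytes) -> bytes:
    """Overwrite the marker near the start of the file with spaces of equal length."""
    # Only search the first 1024 bytes so page content is never touched.
    pos = data.find(MARKER, 0, 1024)
    if pos == -1:
        return data  # marker not present: return the bytes unchanged
    # Same-length replacement keeps the offsets in the xref table valid.
    return data[:pos] + b" " * len(MARKER) + data[pos + len(MARKER):]


def patch_file(src: str, dst: str) -> None:
    """Hypothetical helper: read src, blank the marker, write the result to dst."""
    Path(dst).write_bytes(blank_header_marker(Path(src).read_bytes()))
```

For example, `patch_file("output.pdf", "output_plain.pdf")` writes a patched copy and leaves the original file as it was.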

Kind regards,

Loïc

Re: PDF + OCR not PDF/A

Posted: Mon Aug 16, 2010 8:58 pm
by eagleman
To run OCR on an image and save it as PDF:

imageID = Imaging1.CreateGdPictureImageFromFile("00000001.JPG");
// 2nd parameter: true = PDF/A, false = plain PDF (per the manual, see the note below)
iPdfId = Imaging1.PdfOCRStart("00000001.PDF", false, "", "", "", "", "");
Imaging1.PdfAddGdPictureImageToPdfOCR(iPdfId,
    imageID,
    GdPicture.TesseractDictionary.TesseractDictionaryDutch,
    Application.StartupPath + "\\OCR",
    "");
Imaging1.PdfOCRStop(iPdfId);
Imaging1.ReleaseGdPictureImage(imageID);


Eagleman

Note: According to the manual, the 2nd parameter of PdfOCRStart is a boolean: True generates the PDF in PDF/A format, False generates a plain PDF.

Re: PDF + OCR not PDF/A

Posted: Tue Aug 17, 2010 8:30 am
by rromeijn
Eagleman,

According to my manual this function doesn't even exist.

Re: PDF + OCR not PDF/A

Posted: Tue Aug 17, 2010 8:34 am
by rromeijn
Loïc,

As you know, the PDF/A format has several restrictions that plain PDF (1.3) does not have
(hyperlinks are not allowed).
I also have a customer who can only display PDF up to version 1.3 in his (expensive) software.

I will look into the option you described, but an option to save plain PDF would be nice.

Re: PDF + OCR not PDF/A

Posted: Tue Aug 17, 2010 5:08 pm
by eagleman
@rromeijn,

Make sure you have the latest manual. The manual does not show a version number, but its file name is "GdPicture_NET Document Imaging SDK.pdf" and it is about 7.1 MB.

The function I mentioned does exist. Try the code I wrote earlier.

Good luck.

Regards,
Eagleman