PDF + OCR not PDF/A
Posted: Tue Jul 06, 2010 1:20 pm
by rromeijn
How can I save the image as a plain PDF with OCR?
This sample code:
Imaging1.CreateImageFromFile ("image.tif")
Imaging1.SaveAsPDFOCR("output.pdf", TesseractDictionaryEnglish, App.Path & "\AppData") 'AppData includes dictionary files
Imaging1.CloseNativeImage
saves the image as PDF/A + OCR.
I want plain PDF + OCR.
Re: PDF + OCR not PDF/A
Posted: Mon Aug 16, 2010 11:24 am
by rromeijn
A lot of views, but not one reply.
Re: PDF + OCR not PDF/A
Posted: Mon Aug 16, 2010 3:22 pm
by eagleman
@rromeijn
I do the following:
imageID = Imaging1.CreateGdPictureImageFromFile("00000001.JPG");
iPdfId = Imaging1.TwainPdfStart("00000001.PDF", true, "", "", "", "", "");
Imaging1.TwainAddGdPictureImageToPdf(iPdfId, imageID);
Imaging1.TwainPdfStop(iPdfId);
Imaging1.ReleaseGdPictureImage(imageID);
Good luck.
Eagleman
Re: PDF + OCR not PDF/A
Posted: Mon Aug 16, 2010 3:25 pm
by rromeijn
Thanks,
but that saves the image as a PDF without OCR.
I need PDF with OCR, but not PDF/A with OCR.
Re: PDF + OCR not PDF/A
Posted: Mon Aug 16, 2010 3:31 pm
by Loïc
Hi,
This option is not available.
Why is PDF/A a problem for you? PDF/A is 100% PDF compliant.
A workaround is to remove the PDF/A flag by replacing the header information "%âãÏÓ" with " " in the generated PDF. But in my humble opinion there is no sense in doing that...
Kind regards,
Loïc
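The workaround Loïc describes above amounts to a byte-level patch of the generated file. A minimal C# sketch, assuming the marker is stored as the Latin-1 bytes 25 E2 E3 CF D3 near the start of the file and that overwriting it with the same number of spaces is what is meant (keeping the length unchanged leaves the file's internal byte offsets intact). The class and method names are hypothetical and not part of GdPicture, and nothing in this thread confirms how a particular viewer will treat the patched file.

```csharp
using System;
using System.IO;

static class PdfHeaderPatch
{
    // "%âãÏÓ" as Latin-1 bytes (assumption: this is how the marker is stored).
    static readonly byte[] Marker = { 0x25, 0xE2, 0xE3, 0xCF, 0xD3 };

    // Overwrites the first marker found in the first 1024 bytes with spaces.
    // Returns true if the buffer was changed.
    public static bool BlankBinaryMarker(byte[] pdf)
    {
        int limit = Math.Min(pdf.Length, 1024) - Marker.Length;
        for (int i = 0; i <= limit; i++)
        {
            bool match = true;
            for (int j = 0; j < Marker.Length; j++)
            {
                if (pdf[i + j] != Marker[j]) { match = false; break; }
            }
            if (!match) continue;

            // Same-length replacement: the file size does not change.
            for (int j = 0; j < Marker.Length; j++) pdf[i + j] = 0x20;
            return true;
        }
        return false;
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("usage: PdfHeaderPatch <input.pdf> <output.pdf>");
            return;
        }
        byte[] pdf = File.ReadAllBytes(args[0]);
        bool changed = BlankBinaryMarker(pdf);
        File.WriteAllBytes(args[1], pdf);
        Console.WriteLine(changed ? "Marker blanked." : "Marker not found; file copied unchanged.");
    }
}
```

Writing to a separate output file keeps the original intact, so the result can be checked in the target viewer before the original is discarded.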
Re: PDF + OCR not PDF/A
Posted: Mon Aug 16, 2010 8:58 pm
by eagleman
To do OCR on image and save as PDF:
imageID = Imaging1.CreateGdPictureImageFromFile("00000001.JPG");
iPdfId = Imaging1.PdfOCRStart("00000001.PDF", true, "", "", "", "", "");
Imaging1.PdfAddGdPictureImageToPdfOCR(iPdfId
, imageID
, GdPicture.TesseractDictionary.TesseractDictionaryDutch
, Application.StartupPath.ToString() + "\\OCR"
, "");
Imaging1.PdfOCRStop(iPdfId);
Imaging1.ReleaseGdPictureImage(imageID);
Eagleman
Note: According to the manual, the 2nd parameter of PdfOCRStart is a boolean: True generates the PDF in PDF/A format, False does not. So pass false instead of true in the code above to get plain PDF + OCR.
Re: PDF + OCR not PDF/A
Posted: Tue Aug 17, 2010 8:30 am
by rromeijn
Eagleman,
According to my manual, this function doesn't even exist.
Re: PDF + OCR not PDF/A
Posted: Tue Aug 17, 2010 8:34 am
by rromeijn
Loïc,
As you know, the PDF/A format has several restrictions that do not exist in PDF 1.3
(hyperlinks are not allowed).
I also have a customer who can only display PDF up to version 1.3 in his (expensive) software.
I will look into the option you described, but an option to save plain PDF would be nice.
Re: PDF + OCR not PDF/A
Posted: Tue Aug 17, 2010 5:08 pm
by eagleman
@rromeijn,
Make sure you have the latest manual. Although the manual does not show a version number, its file name is "GdPicture_NET Document Imaging SDK.pdf" and it is about 7.1 MB.
The function I mentioned does exist. Try the code I wrote earlier.
Good luck.
Regards,
Eagleman