I have been using the code below to OCR PDF image files successfully for a while, but I recently had to change environments, and now the resulting PDF/A no longer contains any text. Each page appears to be read and rendered, but the PdfAddGdPictureImageToPdfOCR call does not seem to do anything.
Two questions:
Is there a way to tell whether PdfAddGdPictureImageToPdfOCR has failed (something like a GetStat call)?
What might cause PdfAddGdPictureImageToPdfOCR to fail silently? Would a missing dictionary, for example, cause an error?
Thanks,
Leo
Code:
Dict = "eng"
PdfID = oGdPictureImaging.PdfOCRStart(OutputFilePath, True, "", "", "", "", "DocDigester")
oGdPictureImaging.OCRTesseractSetPassCount(2)
If InputPDF.LoadFromFile(pdfPath, False) = GdPicture.GdPictureStatus.OK Then
    node.GetProperties().Define("GdP_PDF_Pages", InputPDF.GetPageCount())
    For i As Integer = 1 To InputPDF.GetPageCount()
        node.GetProperties().Define("Done Reading Page" & i, "True")
        InputPDF.SelectPage(i)
        ImageID = InputPDF.RenderPageToGdPictureImageEx(200, True)
        Dim pgText As String = oGdPictureImaging.PdfAddGdPictureImageToPdfOCR(PdfID, ImageID, Dict, sciroot & "apps\bin\win", "")
        oGdPictureImaging.ReleaseGdPictureImage(ImageID)
    Next i
Else
    'report out reason for problem.
    Dim errCode As Integer = InputPDF.GetStat()
    node.GetProperties().Define("Error", pdfPath & "GdPicturePDF LoadFromFile Status not OK. ErrCode = " & errCode)
End If
InputPDF.CloseDocument()
oGdPictureImaging.PdfOCRStop(PdfID)