The code below does not return an error, but the output string is garbage. At first I thought the problem was the quality of the attached image, which is very poor, but then I captured an image of a PDF page and tried to scan that, and it produced garbage as well. I have included the code I am using to make sure it isn't user error.
Here is the code. I plan to run it in batch mode over dozens of .tif files, extract the text, and work with that text later in the code.
Code:
// Create the imaging object and register both license keys.
GdPictureImaging oGdPictureImaging = new GdPictureImaging();
oGdPictureImaging.SetLicenseNumber("my key");
oGdPictureImaging.SetLicenseNumberOCRTesseract("my key");

// Load the test image from disk.
int ImageId = oGdPictureImaging.CreateGdPictureImageFromFile(@"C:\projects\pdf conversion\OCR\3-5-2011 8-19-37 AM.png");

// Run OCR with the English dictionary and no character whitelist.
string output = oGdPictureImaging.OCRTesseractDoOCR(ImageId, TesseractDictionary.TesseractDictionaryEnglish, "C:/Program Files/GdPicture.NET/Redist/OCR/", "");
Console.WriteLine(output);
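For context, this is roughly the batch loop I have in mind. It is only a sketch: `BatchOcr`, `Run`, and `ocrOneFile` are names I made up for illustration, and `ocrOneFile` stands in for the GdPicture calls above wrapped in a method that takes a file path and returns the recognized text.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class BatchOcr
{
    // Calls ocrOneFile for every file matching *.tif in the folder and
    // returns a map of file name -> extracted text.
    // ocrOneFile is a hypothetical wrapper around the OCR calls shown above.
    public static Dictionary<string, string> Run(string folder, Func<string, string> ocrOneFile)
    {
        var results = new Dictionary<string, string>();
        foreach (string path in Directory.GetFiles(folder, "*.tif"))
        {
            try
            {
                results[Path.GetFileName(path)] = ocrOneFile(path);
            }
            catch (Exception ex)
            {
                // Report the failure and keep going with the remaining files.
                Console.Error.WriteLine(path + ": " + ex.Message);
            }
        }
        return results;
    }
}
```

I would then call something like `BatchOcr.Run(folder, path => MyOcr(path))`, where `MyOcr` is my own (not yet written) method, and process the returned dictionary afterwards.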
Thanks,
Josef