Based on Google's open source Tesseract OCR engine, the GdPicture Tesseract Plugin adds OCR features to GdPicture ActiveX, such as text recognition on a specific area of an image and the ability to create searchable PDF/A files (PDF-OCR) from scanned documents, images or existing PDF documents.
- Unicode Support.
- Character recognition confidence.
- Retrieve character location.
- Output text.
- Support for PDF/A OCR generation (PDF Image + hidden searchable text).
- Multiple languages: English, French, Italian, German, Spanish, Brazilian Portuguese, Vietnamese, Polish and Dutch.
- Can recognize only digits, only alphabetic characters or only "whitelisted" characters.
- Fast area processing.
- Document orientation detection.
- Easy to use.
- Fast, accurate & bug free.
- Royalty-free licensing: no distribution license required for server or desktop.
Where can I use or evaluate the GdPicture Tesseract OCR Plugin for ActiveX?
The binaries of this plugin are included in the following GdPicture ActiveX SDKs:
GdPicture Pro Imaging SDK
GdPicture Light Imaging Toolkit
GdTwain Pro SDK
GdTwain ActiveX
By downloading one of these SDKs, you can use or evaluate the GdPicture Tesseract Plugin.
You can get a one-month trial key here. You can also purchase licenses here.
The plugin must be unlocked before use; see "How can I unlock the GdPicture Tesseract OCR Plugin?".
How can I unlock the GdPicture Tesseract OCR Plugin?
Call the SetLicenseNumberOCRTesseract() method, passing your license key as the parameter:
E.g., Object.SetLicenseNumberOCRTesseract("YourKey");
What is the minimum text size to get reasonable accuracy?
The minimum text height is about 15-20 pixels. Below 15 pixels, accuracy decreases dramatically.
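As a rough check before running OCR, the pixel height of printed text can be estimated from its point size and the scan resolution (1 point = 1/72 inch). The sketch below is illustrative only and is not part of the plugin. The function names are hypothetical, the 15-pixel threshold is the lower bound quoted above, and using the nominal font size as the text height is an assumption that slightly overestimates the height of the actual glyphs.

```python
import math

POINTS_PER_INCH = 72.0
MIN_TEXT_HEIGHT_PX = 15  # lower bound of the 15-20 pixel range quoted above


def text_height_px(font_size_pt: float, dpi: float) -> float:
    """Approximate text height in pixels for a font size scanned at `dpi`.

    Assumption: the nominal font size stands in for the text height.
    """
    return font_size_pt / POINTS_PER_INCH * dpi


def min_dpi_for(font_size_pt: float, target_px: float = MIN_TEXT_HEIGHT_PX) -> int:
    """Smallest whole DPI at which the text reaches `target_px` in height."""
    return math.ceil(target_px * POINTS_PER_INCH / font_size_pt)
```

For instance, `text_height_px(10, 72)` gives 10 pixels, which is below the threshold, so a 10-point document would need to be scanned at a higher resolution or enlarged before OCR.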
How do I perform OCR on a specific zone of an image?
1- Load the image (see your toolkit reference guide).
2- Define the zone (also called the region of interest) using the SetROI() method.
E.g., Object.SetROI(100, 100, 250, 50)
3- Perform the OCR using the OCRTesseractDoOCR() method.
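Before defining the zone in step 2, it can help to make sure the zone actually lies inside the image. The sketch below is a standalone helper, not part of the plugin. It assumes that the four arguments in the example above are left, top, width and height in pixels, which the text does not state, and the function name `clamp_roi` is hypothetical.

```python
from typing import Tuple


def clamp_roi(left: int, top: int, width: int, height: int,
              image_width: int, image_height: int) -> Tuple[int, int, int, int]:
    """Clip a (left, top, width, height) zone to the image bounds.

    Assumption: the zone is expressed as left, top, width, height in pixels.
    Raises ValueError if no part of the zone lies inside the image.
    """
    # Compute the far edges from the original origin before clamping it.
    right = min(left + width, image_width)
    bottom = min(top + height, image_height)
    left = max(left, 0)
    top = max(top, 0)
    if right <= left or bottom <= top:
        raise ValueError("zone does not overlap the image")
    return left, top, right - left, bottom - top
```

The returned tuple can then be passed to the toolkit's region-of-interest call in whatever argument order the reference guide specifies.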
How do I build searchable PDF/A files (PDF-OCR) from multi-page TIFF images, PDFs or scanned documents?
Using GdPicture ActiveX editions: click here
How do I make a custom dictionary to increase the recognition of specific words?
For an English dictionary: edit eng.user-words, then add your own words in UTF-8 format, one word per line, sorted alphabetically.
For a French dictionary: edit fra.user-words, then add your own words in UTF-8 format, one word per line, sorted alphabetically.
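Keeping the word list sorted and free of duplicates by hand is error-prone, so a small script can do it. The following is a sketch, not a tool shipped with the plugin. The function name is hypothetical, and it assumes that plain code-point ordering is an acceptable reading of "sorted alphabetically" (uppercase letters sort before lowercase ones).

```python
from pathlib import Path
from typing import Iterable, List


def add_user_words(path: str, new_words: Iterable[str]) -> List[str]:
    """Merge `new_words` into a user-words file: UTF-8, one word per line, sorted."""
    file = Path(path)
    # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
    existing = file.read_text(encoding="utf-8-sig").splitlines() if file.exists() else []
    # Strip surrounding whitespace, drop blank entries, and remove duplicates.
    words = {w.strip() for w in list(existing) + list(new_words) if w.strip()}
    ordered = sorted(words)  # assumption: code-point order is acceptable
    file.write_text("".join(w + "\n" for w in ordered), encoding="utf-8")
    return ordered
```

A call such as `add_user_words("eng.user-words", ["GdPicture", "Tesseract"])` rewrites the file in place, so keep a backup of the original dictionary if it matters.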
OCR Tesseract Plugin v.1 for ActiveX
1.1.6 (25 September 2009)
| License | Price/License (USD) | |
|---|---|---|
| 1 Developer | 499.00 | |
| 2 Developers | 638.72 | package |
| 3 Developers | 958.08 | package |
| 4 Developers | 1277.44 | package |
| 5 Developers | 1596.80 | package |
| Site license | 1999.00 | site |
All licenses include royalty-free distribution with your application or system.
The license price is the same for server and desktop deployment; we do not charge an extra fee for server deployment.
The software license key remains valid for all future 1.x versions of the GdPicture Tesseract Plugin, with free upgrades.
ActiveX licensing
Per-developer licenses: This license type entitles the specified number of developers (1, 2, 3, 4 or 5) within a single organization to write software with access to GdPicture Pro Imaging SDK.
Site license: This license entitles an unlimited number of developers in a single organization, at a single physical address, to write software with access to GdPicture Pro Imaging SDK.
