Based on Google's open source Tesseract OCR V3 engine, the GdPicture OCR Tesseract Plugin adds OCR features to GdPicture.NET, such as text recognition on a specific area of an image and the ability to create searchable PDF/A files (PDF-OCR) from scanned documents, images or existing PDF documents.
GdPicture OCR Tesseract Plugin supports many languages (see the list in Main Features below) and can process more than 90 document formats.
Note: To OCR PDF files, the GdPicture Managed PDF plugin is required.
Main Features
- Full Unicode Support.
- Multi-thread Support (demo application included in the GdPicture.NET SDK package).
- Character recognition confidence.
- Retrieve character location.
- Output text.
- Support for PDF/A OCR generation (PDF Image + hidden searchable text).
- Can produce very small PDF and PDF/A files containing Unicode characters.
- Support for nearly 40 languages such as English, French, Italian, German, Spanish, Brazilian Portuguese, Vietnamese, Chinese, Russian, Polish, Dutch, etc.
- Can recognize only digits, only alphabetic characters or only "white listed" characters.
- OCR context support: specifies whether the engine is processing a document, a single word, a single character, a text block, vertical text, etc.
- Fast area processing.
- Document orientation detection.
Where can I use or evaluate the GdPicture OCR Tesseract Plugin?
The binaries of this plugin are included in the GdPicture.NET SDK.
By downloading the SDK you can use or evaluate the GdPicture OCR Tesseract Plugin.
You can get a one-month trial key here. You can also purchase licenses here.
The plugin needs to be unlocked; see "How can I unlock the GdPicture OCR Tesseract Plugin?".
How can I unlock the GdPicture OCR Tesseract Plugin?
Call the SetLicenseNumberOCRTesseract() method, passing your license key as the parameter, e.g. Object.SetLicenseNumberOCRTesseract("YourKey");
Where can I download the dictionaries for all supported languages?
You can download the OCR language pack, which includes all supported languages, from http://www.gdpicture.com/download/ocr_language_pack.zip
What is the minimum text size to get reasonable accuracy?
The minimum text height is about 15-20 pixels. Below 15 pixels, accuracy drops sharply.
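As a rough illustration of this rule of thumb, the sketch below estimates how tall printed text will be in pixels at a given scan resolution, and the lowest resolution that reaches a target height. It is not part of GdPicture.NET: the function names are hypothetical, and it assumes the text height is roughly the nominal font size (1 point = 1/72 inch), which overestimates the height of lowercase letters.

```python
import math

POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def text_height_px(font_size_pt: float, dpi: float) -> float:
    """Approximate text height in pixels for a font size scanned at `dpi`.

    Assumption: the nominal font size is used as the text height.
    """
    if font_size_pt <= 0 or dpi <= 0:
        raise ValueError("font_size_pt and dpi must be positive")
    return font_size_pt * dpi / POINTS_PER_INCH


def min_dpi(font_size_pt: float, target_height_px: float = 20) -> int:
    """Lowest whole-number DPI at which the text reaches `target_height_px`."""
    if font_size_pt <= 0 or target_height_px <= 0:
        raise ValueError("font_size_pt and target_height_px must be positive")
    return math.ceil(target_height_px * POINTS_PER_INCH / font_size_pt)


if __name__ == "__main__":
    # Hypothetical inputs: 10-point body text, 20-pixel target height.
    print(min_dpi(10, 20))
    print(round(text_height_px(10, 150), 1))
```

Under these assumptions, 10-point text needs a scan resolution of about 144 DPI to reach a 20-pixel height, so smaller print calls for a correspondingly higher resolution.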
How do I perform OCR on a specific zone of an image?
1- Load the image (see your toolkit reference guide).
2- Define the zone (also called region of interest) using the SetROI() method.
3- Perform the OCR using the OCRTesseractDoOCR() method.
How do I build searchable PDF/A files (PDF-OCR) from multi-page TIFF images, PDFs or scanned documents?
OCR zone of a PDF page
OCR a multi-page TIFF image
Generating searchable PDF from Scanner, Bitmap, or PDF
- Compiled demo applications are available in [Install directory]\samples\Bin\
- C# and VB.NET demo applications, including source code, are available in [Install directory]\samples\AnyCPU\
- Other code snippets are available in the online reference guide at http://guides.gdpicture.com
- Discussions about the GdPicture OCR Tesseract Plugin can be found in the dedicated section of our community forums at http://forums.gdpicture.com/ocr-tesseract/
OCR Tesseract Plugin For GdPicture.NET 9
|Site worldwide license||3594.00||worldwide|
All licenses include royalty-free distribution with your application or system.
The license price is the same for server and desktop deployment; we do not charge an extra fee for server deployment.
The software license key remains valid for all future 9.X versions of GdPicture.NET, with free upgrades.
Per developer licenses: This license type entitles the specified number of developers/build machines at a single physical address to write software with access to GdPicture.NET.
Site License: This license entitles an unlimited number of developers of the same organization at a single physical address to write software with access to GdPicture.NET.
Worldwide Site License: This license entitles an unlimited number of developers of the same organization at unlimited physical addresses to write software with access to GdPicture.NET.