OCR SDK
GdPicture includes an Optical Character Recognition (OCR) engine for building any kind of application that requires OCR technology.
With GdPicture OCR SDK, put the power of more than 15 years of continuously improved technologies into your own application.
GdPicture OCR SDK
The GdPicture OCR engine provides features such as text recognition on a specific area of an image and the creation of searchable PDF/A files (PDF-OCR) from scanned documents, images, or existing PDF documents.
The GdPicture OCR engine offers built-in multithreading support, handles more than 100 languages (full list here), and can process more than 100 document formats.
Main features
Other OCR technologies
ADR – Automatic Document Recognition
The GdPicture.NET ADR engine is designed for automatic document classification and categorization tasks in a document and information management system. It allows your applications to identify invoices, checks, forms, orders, delivery notes, page separators, or any kind of structured document.
MICR – Magnetic Ink Character Recognition
The GdPicture.NET MICR SDK decodes “CMC7” and “E-13B” characters from documents with outstanding speed and accuracy.
It can also detect and decode the MICR line from any structured document such as checks by analyzing the full page layout.
MRZ – Machine Readable Zone
ID documents like passports, visas, and other ID cards contain a Machine Readable Zone (MRZ) which makes them readable by machines. The GdPicture.NET MRZ recognition engine allows you to create applications to extract and decode MRZ characters on all types of documents.
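Decoded MRZ fields are usually validated with the check digits defined by ICAO Doc 9303. The sketch below illustrates that public check-digit scheme only; it is not the GdPicture.NET API, and the function names are hypothetical.

```python
# Minimal sketch of the ICAO 9303 MRZ check digit (not GdPicture.NET code).
# Function names are illustrative assumptions.

WEIGHTS = (7, 3, 1)  # repeating weights defined by ICAO 9303


def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum of the field's character values, modulo 10."""
    total = sum(
        mrz_char_value(ch) * WEIGHTS[i % 3] for i, ch in enumerate(field)
    )
    return total % 10


def mrz_field_is_valid(field: str, check_char: str) -> bool:
    """Compare a decoded field against the check digit printed after it."""
    return check_char.isascii() and check_char.isdigit() and (
        mrz_check_digit(field) == int(check_char)
    )
```

A field whose computed digit does not match the printed one was most likely misread, so an application can flag it for review or a second recognition pass.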
OMR – Optical Mark Recognition
The GdPicture.NET OMR engine detects the state of checkboxes, fill-in areas, multiple-choice examination forms, or any other area where a mark indicates a choice.
It also provides an anchoring mechanism (also known as template recognition) to specify the area that needs to be processed.
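To make the idea of mark detection and anchoring concrete, here is a simplified sketch of one common approach: shift a template zone by the offset of a detected anchor, then measure the proportion of dark pixels in the zone. It is not how the GdPicture.NET engine is implemented; the function names, the 0/1 page representation, and the 0.3 threshold are all assumptions made for illustration.

```python
# Simplified OMR sketch (assumed approach, not the GdPicture.NET engine).
# A page is a binarized image: a list of rows, 1 = dark pixel, 0 = light.
from typing import List, Tuple

Zone = Tuple[int, int, int, int]  # left, top, width, height
Point = Tuple[int, int]           # x, y


def shift_zone(zone: Zone, template_anchor: Point, detected_anchor: Point) -> Zone:
    """Move a template zone by the offset between expected and detected anchor."""
    dx = detected_anchor[0] - template_anchor[0]
    dy = detected_anchor[1] - template_anchor[1]
    left, top, width, height = zone
    return (left + dx, top + dy, width, height)


def fill_ratio(page: List[List[int]], zone: Zone) -> float:
    """Fraction of dark pixels inside the zone."""
    left, top, width, height = zone
    dark = sum(
        page[y][x]
        for y in range(top, top + height)
        for x in range(left, left + width)
    )
    return dark / (width * height)


def is_marked(page: List[List[int]], zone: Zone, threshold: float = 0.3) -> bool:
    """Treat the zone as marked when enough of it is dark (threshold is assumed)."""
    return fill_ratio(page, zone) >= threshold
```

In practice the threshold depends on the form design and scan quality, so it is normally tuned on sample documents rather than fixed in advance.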
MRC – Mixed Raster Content
The GdPicture.NET MRC engine produces spectacular results by automatically adjusting the tradeoff between quality and compression rate to provide top-quality PDF MRC documents at the lowest possible size.
It uses elaborate adaptive document-learning algorithms that identify and classify forms of any kind very quickly.
ICR – Intelligent Character Recognition
The GdPicture.NET ICR engine expands the machine vision capabilities of the OCR SDK. At the moment, it recognizes handwritten digits located in boxes. Future versions will support more contexts.
KVP – Key-Value Pair Extraction
Bring Intelligent Document Understanding and Processing features to your unstructured and semi-structured documents with the new key-value pair data extractor.
The engine can instantly identify valuable information in a document, extract it, and qualify it.
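As a rough illustration of what identifying, extracting, and qualifying a key-value pair means, the toy sketch below pulls "Key: Value" lines out of recognized text and assigns each value a coarse type. It is a deliberately naive stand-in, not the GdPicture.NET extractor: the regular expressions, the type labels, and all names are assumptions.

```python
# Toy key-value extraction over OCR text (illustrative only, not GdPicture.NET).
import re
from typing import List, NamedTuple


class KeyValue(NamedTuple):
    key: str
    value: str
    kind: str  # coarse, assumed type label: "date", "number", or "text"


# "Key: Value" on a single line; the key is limited to 40 characters.
_PAIR = re.compile(r"^\s*([^:]{1,40}?)\s*:\s*(\S.*?)\s*$")
_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")          # ISO dates only
_NUMBER = re.compile(r"^[-+]?\d[\d,]*(\.\d+)?$")    # e.g. 1,250.00


def qualify(value: str) -> str:
    """Assign a coarse type to an extracted value."""
    if _DATE.match(value):
        return "date"
    if _NUMBER.match(value):
        return "number"
    return "text"


def extract_pairs(text: str) -> List[KeyValue]:
    """Return every 'Key: Value' line found in the text, with a type label."""
    pairs = []
    for line in text.splitlines():
        match = _PAIR.match(line)
        if match:
            key, value = match.group(1), match.group(2)
            pairs.append(KeyValue(key, value, qualify(value)))
    return pairs
```

A real extractor also has to cope with keys and values that sit in different columns or on different lines, which a line-based pattern like this cannot handle.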
Usage examples
Creating a searchable PDF document from a scanned PDF document
How to automatically rotate pages of a multipage TIFF file using OCR
How to OCR a multipage TIFF image
How to OCR a specific zone of a PDF document
How to convert TIFF images into searchable PDF documents in a multithreaded environment
Making a searchable PDF document from any document using a scanner
How to use the GdPicture.NET OCR SDK
Download and install the GdPicture.NET package from here.
You will be able to find some compiled demo applications in
[Install directory]\Samples\Bin\
You will be able to find C# and VB.NET demo applications including source code in
[Install directory]\Samples\WinForm\
You will find other code snippets in the online reference guide found here.
You can find discussions about the GdPicture Tesseract OCR Plugin in the dedicated section of our community forums located here.