Royalty-Free OCR SDK & Searchable PDF Toolkit for GdPicture.NET SDK

Looking for a strong OCR SDK? GdPicture OCR is a 100% royalty-free Optical Character Recognition engine for building applications that require OCR technology.

Developers can add robust, fast, multi-threaded OCR support to managed and unmanaged applications with a few lines of code.

60-day free trial: download GdPicture.NET now!

GdPicture OCR SDK

Based on a continuously improved version of Google's open-source Tesseract OCR engine, the GdPicture OCR Tesseract Plugin adds features to GdPicture.NET such as text recognition on a specific area of an image and the ability to create searchable PDF/A files (PDF-OCR) from scanned documents, images, or existing PDF documents.

The GdPicture OCR Tesseract Plugin offers built-in multi-threading support, handles more than 100 languages (full list here), and can process more than 100 document formats.

Main features

  • OCR SDK with full Unicode support.
  • Multi-thread support (demo application included in the GdPicture.NET SDK package).
  • Character recognition confidence.
  • Retrieves character locations.
  • Retrieves font information (style, family...).
  • Retrieves paragraph information (justification, alignment, bounding box...).
  • Output text.
  • Support for PDF/A OCR generation (PDF Image + hidden searchable text).
  • Can produce very small PDF & PDF/A files containing Unicode characters.
  • Supports more than 100 languages such as English, French, Italian, German, Spanish, Brazilian Portuguese, Vietnamese, Chinese, Russian, Polish, Dutch, etc.
  • Can recognize only digits, only alphabetic characters, or only “white listed” characters, with an option to specify a black list of characters.
  • OCR context support: defines whether the engine is processing a document, a single word, a single character, a text block, vertical text, etc.
  • Fast area processing.
  • Automatic document orientation detection.
  • Automatic skew correction.
  • Automatic image correction to increase OCR accuracy and speed.
  • Segmentation features to detect blocks, paragraphs, lines, words, and characters.
  • Fully customizable through variables.
  • Built-in multi-threaded engine for PDF/OCR creation.
  • Recognizes and converts more than 100 formats to DOCX, HTML, PDF, and text files.
  • Any-CPU: available in 32-bit & 64-bit versions.
  • Can work in multi-threaded applications.
  • And more than 100 other features...
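
The white list, black list, and OCR context options listed above correspond to settings that also exist in the upstream open-source Tesseract engine. As a rough illustration only, the sketch below builds command-line arguments for the stock Tesseract CLI, using its `--psm` option and its `tessedit_char_whitelist` / `tessedit_char_blacklist` variables. It does not use the GdPicture.NET API; the `build_tesseract_args` function and the context names in `PSM` are hypothetical helpers introduced for this example.

```python
# Illustrative sketch for the upstream Tesseract CLI, NOT the GdPicture.NET API.
# The context names below are made up for this example; the numeric values are
# Tesseract's page segmentation modes (--psm).
PSM = {
    "document": 3,        # fully automatic page segmentation (Tesseract default)
    "vertical_block": 5,  # single uniform block of vertically aligned text
    "text_block": 6,      # single uniform block of text
    "single_line": 7,     # single text line
    "single_word": 8,     # single word
    "single_char": 10,    # single character
}


def build_tesseract_args(context="document", whitelist=None, blacklist=None):
    """Return extra CLI arguments selecting an OCR context and character filters."""
    if context not in PSM:
        raise ValueError(f"unknown OCR context: {context!r}")
    args = ["--psm", str(PSM[context])]
    if whitelist:
        # Only these characters may appear in the recognized text.
        args += ["-c", f"tessedit_char_whitelist={whitelist}"]
    if blacklist:
        # These characters are never returned by the engine.
        args += ["-c", f"tessedit_char_blacklist={blacklist}"]
    return args


# Example: recognize a single word made of digits only.
digit_word_args = build_tesseract_args("single_word", whitelist="0123456789")
```

Assuming the `tesseract` executable is installed, these arguments could be appended to a call such as `tesseract scan.png stdout`, where `scan.png` is a placeholder file name.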

 

To OCR PDF files, the GdPicture Managed PDF plugin is required.
MICR features are included in the OCR Plugin.

How to use the GdPicture Tesseract OCR SDK

Download and install the GdPicture.NET package.

Compiled demo applications are located in
[Install directory]\Samples\Bin\
C# and VB.NET demo applications, including source code, are located in
[Install directory]\Samples\WinForm\
Further code snippets are available in the online reference guide at http://guides.gdpicture.com
Discussions about the GdPicture Tesseract OCR Plugin can be found in the dedicated Tesseract OCR section of our community forums.