GdPicture.NET OCR Engine: Improvements and New Features

ORPALIS is pleased to announce a major update to the OCR engine of the GdPicture.NET Document Imaging SDK, bringing better speed, better accuracy, and a new connector for external OCR engines.

Muret, France, 28th October 2019

Enhancement of the OCR engine

  • Dramatically reduced average processing time. The engine may be up to 20x faster on very complex documents.
  • Better detection of text writing direction and orientation.
  • Improved detection in bitmaps with non-uniform backgrounds.
  • Better text detection in complex regions with noise in the background, diagrams, and dotted areas.
  • Dramatically enhanced PDF-OCR generation.

Integration of external engines

Since version 14.1.39, it is possible to link any external OCR engine to a GdPicture.NET application during PDF/OCR generation.
A tutorial is available in the documentation.

A few facts about the GdPicture.NET OCR engine

  • 130 languages are recognized.
  • Thanks to the research carried out to build the GdPicture.NET MRC compression engine, the adaptive pre-processing and pre-segmentation phases improve the accuracy and speed of the subsequent OCR.
  • The post-processing phase effectively reprocesses false-positive results, thanks to many years of experimentation on millions of documents.
  • The Tesseract-based OCR engine has been modified and is continuously optimized to match the performance of established competitors, and it often exceeds them on complex documents.
  • GdPicture.NET OCR includes multiple segmentation engines of its own.