GdPicture.NET OCR Engine: Improvements and New Features
ORPALIS is pleased to announce a major update of the GdPicture.NET Document Imaging SDK OCR engine with better speed, better accuracy, and a new connector for external OCR engines.
Muret, France, 28th October 2019
Enhancement of the OCR engine
- Dramatically reduced average processing time. The engine may be up to 20x faster on very complex documents.
- Better detection of text writing direction and orientation.
- Improved detection in bitmaps with non-uniform backgrounds.
- Better text detection in complex regions with noise in the background, diagrams, and dotted areas.
- Dramatically enhanced PDF-OCR generation.
Integration of external engines
Since version 14.1.39, it is possible to link any external OCR engine to a GdPicture.NET application during PDF/OCR generation.
A tutorial is available in the documentation.
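As a rough illustration of what linking an external engine involves, the sketch below wraps an arbitrary recognizer behind a small interface that a PDF/OCR step could call once per page. This is not the GdPicture.NET API: every name here (`Word`, `ExternalOcrEngine`, `CallbackEngine`, `build_text_layer`, `fake_engine`) is a hypothetical placeholder chosen for the example, and Python is used only for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


@dataclass
class Word:
    """One recognized word with its bounding box in page pixels (hypothetical type)."""
    text: str
    left: int
    top: int
    width: int
    height: int


class ExternalOcrEngine(Protocol):
    """Hypothetical contract that any external engine adapter must satisfy."""

    def recognize(self, page_image: bytes) -> List[Word]:
        ...


class CallbackEngine:
    """Adapter that exposes an arbitrary function as an ExternalOcrEngine."""

    def __init__(self, fn: Callable[[bytes], List[Word]]) -> None:
        self._fn = fn

    def recognize(self, page_image: bytes) -> List[Word]:
        return self._fn(page_image)


def build_text_layer(pages: List[bytes], engine: ExternalOcrEngine) -> List[List[Word]]:
    """Run the engine on every page and collect the words for a searchable text layer."""
    return [engine.recognize(page) for page in pages]


def fake_engine(page_image: bytes) -> List[Word]:
    # Stand-in for a real engine: returns one dummy word describing the input size.
    return [Word(text=f"page-{len(page_image)}-bytes", left=0, top=0, width=10, height=10)]
```

In this sketch the host application depends only on the `recognize` contract, so swapping one external engine for another means writing a new adapter rather than changing the PDF generation code.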
A few facts about the GdPicture.NET OCR engine
- 130 languages are recognized.
- Thanks to the research carried out to build the GdPicture.NET MRC compression engine, the adaptive pre-processing and pre-segmentation phases improve subsequent OCR accuracy and speed.
- The post-processing phase reprocesses false-positive results very effectively, drawing on many years of experimentation on millions of documents.
- The Tesseract-based OCR engine has been modified and is continuously optimized to match, and on complex documents often exceed, the performance of established competitors.
- GdPicture.NET OCR includes several segmentation engines of its own.