October 29, 2019 | Important features, OCR

OCR Support in GdPicture.NET: Updates and Strategies


Illustration for the article about new major features and improvements in the GdPicture.NET OCR engine.

Hi Everyone,

We have recently worked a lot on improving our OCR engine, and in today’s post, we’re going to review the new enhancements and features.
We are also going to talk about the different strategies you can choose to integrate OCR in your applications, whether you’re using the GdPicture.NET engine or not.

Enhancement of our OCR engine

First, let’s start with the latest updates to the GdPicture.NET OCR engine.

  • Dramatically reduced average processing time. The engine may be up to 20x faster on very complex documents.
  • Better text writing direction and orientation detection.
  • Improved detection in bitmaps with non-uniform backgrounds.
  • Dramatically enhanced PDF-OCR generation.

If you wish to test our engine “live”, try our AvePDF OCR widget with your documents.

Integration of external engines

If you’re not using GdPicture.NET for OCR, note that since version 14.1.39 it is possible to link any external OCR engine to your GdPicture.NET application during PDF/OCR generation.

We have developed a connector to make the job easy for you, and the tutorial below shows how to use it.
The example shown can easily be adapted to other engines; the method is the same.

Here are the step-by-step instructions:

  1. Tell the instance to use an external OCR engine.

    gdpicturePDF.SetOverrideOcrEngine(true);

  2. Subscribe to the ExternalOcrPageRequest event.

    gdpicturePDF.ExternalOcrPageRequest += this.ExternalOcrRequest;

  3. Implement the logic to provide the OCR result through the ExternalOcrRequest event handler.

    // This version uses the "gdpictureocr-json" result model (the recommended one).
    // Requires System.Collections.Generic (List<T>) and Newtonsoft.Json (JsonConvert),
    // in addition to the GdPicture.NET namespace.
    private void ExternalOcrRequest(int ImageID, PdfOcrOptions PdfOcrOptions, out GdPictureStatus Status, out string ResultEncoding, out string OcrResult)
    {
        using (GdPictureOCR gdpictureOCR = new GdPictureOCR())
        {
            gdpictureOCR.ResourceFolder = PdfOcrOptions.ResourcePath;
            gdpictureOCR.AddCustomDictionary(PdfOcrOptions.Dictionary);
            gdpictureOCR.OCRMode = PdfOcrOptions.OCRMode;
            gdpictureOCR.EnableOrientationDetection = PdfOcrOptions.DetectOrientation;
            gdpictureOCR.EnableSkewDetection = PdfOcrOptions.DetectSkew;
            gdpictureOCR.SetImage(ImageID);
            string resultID = gdpictureOCR.RunOCR();
            Status = gdpictureOCR.GetStat();
            if (Status == GdPictureStatus.OK)
            {
                GdPictureOcrResult ocrResult = new GdPictureOcrResult()
                {
                    Paragraphs = new List<GdPictureOcrParagraph>(),
                    PageRotation = gdpictureOCR.GetOrientation()
                };
                for (int paragraphIdx = 0; paragraphIdx < gdpictureOCR.GetParagraphCount(resultID); paragraphIdx++)
                {
                    OCRBlockType blockType = gdpictureOCR.GetBlockType(resultID, gdpictureOCR.GetParagraphBlockIndex(resultID, paragraphIdx));
                    // Reject non-text blocks.
                    if (blockType != OCRBlockType.CaptionText &&
                        blockType != OCRBlockType.FlowingText &&
                        blockType != OCRBlockType.HeadingText &&
                        blockType != OCRBlockType.PulloutText &&
                        blockType != OCRBlockType.VerticalText &&
                        blockType != OCRBlockType.Table)
                    {
                        continue;
                    }
                    GdPictureOcrParagraph paragraph = new GdPictureOcrParagraph()
                    {
                        Lines = new List<GdPictureOcrLine>()
                    };
                    ((List<GdPictureOcrParagraph>)ocrResult.Paragraphs).Add(paragraph);
                    int firstLineIdx = gdpictureOCR.GetParagraphFirstTextLineIndex(resultID, paragraphIdx);
                    int lineCount = gdpictureOCR.GetParagraphTextLineCount(resultID, paragraphIdx);
                    for (int lineIdx = firstLineIdx; lineIdx < firstLineIdx + lineCount; lineIdx++)
                    {
                        GdPictureOcrLine line = new GdPictureOcrLine()
                        {
                            Words = new List<GdPictureOcrWord>()
                        };
                        ((List<GdPictureOcrLine>)paragraph.Lines).Add(line);
                        int firstWordIdx = gdpictureOCR.GetTextLineFirstWordIndex(resultID, lineIdx);
                        int wordCount = gdpictureOCR.GetTextLineWordCount(resultID, lineIdx);
                        for (int wordIdx = firstWordIdx; wordIdx < firstWordIdx + wordCount; wordIdx++)
                        {
                            GdPictureOcrWord word = new GdPictureOcrWord()
                            {
                                Characters = new List<GdPictureOcrCharacter>()
                            };
                            ((List<GdPictureOcrWord>)line.Words).Add(word);
                            int firstCharacterIdx = gdpictureOCR.GetWordFirstCharacterIndex(resultID, wordIdx);
                            int characterCount = gdpictureOCR.GetWordCharacterCount(resultID, wordIdx);
                            for (int characterIdx = firstCharacterIdx; characterIdx < firstCharacterIdx + characterCount; characterIdx++)
                            {
                                int characterLeft = gdpictureOCR.GetCharacterLeft(resultID, characterIdx);
                                int characterTop = gdpictureOCR.GetCharacterTop(resultID, characterIdx);
                                int characterRight = gdpictureOCR.GetCharacterRight(resultID, characterIdx);
                                int characterBottom = gdpictureOCR.GetCharacterBottom(resultID, characterIdx);
                                GdPictureOcrCharacter character = new GdPictureOcrCharacter()
                                {
                                    BBox = new GdPictureOcrRect(characterLeft, characterTop, characterRight, characterBottom),
                                    Value = gdpictureOCR.GetCharacterValue(resultID, characterIdx)
                                };
                                ((List<GdPictureOcrCharacter>)word.Characters).Add(character);
                            }
                        }
                    }
                }
                ResultEncoding = "json";
                OcrResult = JsonConvert.SerializeObject(ocrResult);
            }
            else
            {
                ResultEncoding = OcrResult = null;
            }
        }
    }
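
To make the shape of the result easier to picture, here is a minimal, dependency-free sketch of the hierarchy the handler builds: a page result containing paragraphs, which contain lines, then words, then characters with a bounding box. The `Sketch*` types are hypothetical stand-ins, not GdPicture.NET classes; only their property names are borrowed from the handler above, and the `int` type for `PageRotation` is an assumption. The sketch uses `System.Text.Json` instead of the `JsonConvert` call in the handler, so the exact JSON the connector expects may differ. It assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-ins for the GdPictureOcr* result types.
// Property names mirror those used in the handler; everything else is assumed.
public record SketchRect(int Left, int Top, int Right, int Bottom);
public record SketchCharacter(SketchRect BBox, string Value);
public record SketchWord(List<SketchCharacter> Characters);
public record SketchLine(List<SketchWord> Words);
public record SketchParagraph(List<SketchLine> Lines);
public record SketchResult(int PageRotation, List<SketchParagraph> Paragraphs);

public static class OcrJsonSketch
{
    // Builds a page holding one paragraph, one line, and one word ("Hi"),
    // then serializes it to a JSON string.
    public static string BuildSampleJson()
    {
        var word = new SketchWord(new List<SketchCharacter>
        {
            new SketchCharacter(new SketchRect(10, 20, 18, 34), "H"),
            new SketchCharacter(new SketchRect(19, 20, 23, 34), "i"),
        });
        var line = new SketchLine(new List<SketchWord> { word });
        var paragraph = new SketchParagraph(new List<SketchLine> { line });
        var result = new SketchResult(0, new List<SketchParagraph> { paragraph });
        return JsonSerializer.Serialize(result);
    }
}
```

Calling `OcrJsonSketch.BuildSampleJson()` returns a single JSON string whose outermost object carries `PageRotation` and `Paragraphs`, with each character nested four levels below. When adapting the handler to another engine, the work mostly consists of mapping that engine's output into this kind of nesting before serializing it.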

A few facts about the GdPicture.NET OCR engine

If you’re still deciding which strategy to adopt for your applications, here are some facts we think set the GdPicture.NET OCR engine apart from integrated tools.

  • 130 languages are recognized.
  • Thanks to the research behind the GdPicture.NET MRC compression engine, the adaptive pre-processing and pre-segmentation phases improve subsequent OCR accuracy and speed.
  • The post-processing phase is very effective at reprocessing false-positive results, thanks to many years of experimentation on millions of documents.
  • The Tesseract-based OCR engine is continuously optimized to match, and on complex documents often exceed, the performance of established competitors.
  • GdPicture.NET OCR includes several segmentation engines of its own.

We challenge our engine every day with your documents and try to get the best results for your needs, so let us know how we can help!

Cheers,

Elodie

