.NET OCR SDK with AI-powered text recognition

Build intelligent document processing applications with the GdPicture.NET OCR SDK. Extract text from images and PDFs with more than 99 percent accuracy across 100+ languages using AI and ML technology. Built-in multithreading, automatic preprocessing, and enterprise-grade reliability come without the complexity and support gaps of open-source solutions.

Advanced OCR capabilities

Extract text with more than 99 percent accuracy using AI-powered recognition across 100+ languages. Built-in preprocessing automatically corrects image quality issues — deskew, denoise, and enhance characters. Flexible OCR modes, character filtering, and zonal extraction enable precise control. Create searchable PDFs and process 100+ document formats with multithreaded performance.

Unicode support

Full Unicode support for accurate recognition and output of multilingual text. Process and generate PDFs with Unicode characters in any size. Built-in support for 30+ languages, including English, French, Italian, German, Spanish, Portuguese, Vietnamese, Chinese, Russian, Polish, and Dutch. Extend to 100+ additional languages with Tesseract packs.

Character detection

Advanced character recognition with confidence scoring and precise location data. Configure character allowlists (digits only, alpha only) or denylists to improve accuracy. Retrieve character bounding boxes for exact positioning. Define OCR context — document, page, paragraph, block, line, or word level — to optimize recognition for your specific use case.
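
To make the effect of a character allowlist concrete, here is a minimal sketch in plain C# that reduces a string to an allowed character set. It is only an illustration of the idea, not the SDK's API: `AllowlistFilter` and `Keep` are hypothetical names, and the filtering here happens after recognition, whereas the SDK's allowlist is configured on the OCR engine itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of GdPicture.NET.
public static class AllowlistFilter
{
    // Returns the characters of `text` that appear in `allowed`, in order.
    public static string Keep(string text, string allowed)
    {
        var allowedSet = new HashSet<char>(allowed);
        return new string(text.Where(c => allowedSet.Contains(c)).ToArray());
    }
}
```

Calling `AllowlistFilter.Keep("Inv0ice 12-34", "0123456789")` keeps only the digits and returns `"01234"`.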

Structure extraction

Extract rich text metadata — including font information (style, family), formatting details (bold, italic), and layout properties (justification, alignment, bounding boxes). Intelligent segmentation detects text blocks, paragraphs, lines, words, and individual characters. Output structured text with accurate positioning for downstream document analysis and data extraction workflows.

Image correction

Automatic preprocessing improves OCR accuracy without manual intervention. Built-in capabilities include deskew (skew correction), paragraph detection, noise removal, character enhancement, and line and punch-hole removal. Fast area processing accelerates operations on selected regions. Intelligent corrections deliver high-quality results from poor-quality scans.

Format conversion

Generate searchable PDFs with embedded text layers and PDF/A-4f archival compliance. Our multithreaded engine converts 100+ formats — images, Office documents, CAD files — to searchable PDFs. Recognize and convert documents to DOCX, HTML, PDF, and text formats. Flexible output options ensure broad compatibility and document reuse across your workflows.

System integration

Seamlessly integrates with the .NET SDK’s 100+ document processing features. Multithreaded support for high-volume batch processing with configurable CPU limits. 32-bit and 64-bit compatibility across .NET Framework, .NET Core, and .NET 6+. Works with external Tesseract engines for extended language support. Enterprise-grade architecture scales from single documents to high-volume automated workflows.

Complementary support

ADR

MICR

MRZ

OMR

MRC

KVP

Highlights

Searchable output

Convert scanned documents, images, and existing PDFs into searchable PDF/A files with embedded text layers. Our AI-powered OCR engine extracts text and preserves it invisibly behind the original image, enabling full-text search while maintaining visual fidelity. Create PDF/A-4f-compliant archives for long-term document preservation with perfect searchability.

Multithreading performance

Built-in multithreading processes multiple pages simultaneously for faster OCR operations. Configurable CPU limits optimize performance across diverse workloads — from single documents to high-volume batch processing. Scale seamlessly from desktop applications to enterprise document automation workflows with intelligent thread allocation.
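
As a rough illustration of capping CPU usage during batch work, the sketch below uses the standard .NET `Parallel.ForEach` with `MaxDegreeOfParallelism`. It does not show the SDK's own thread configuration: `BatchRunner`, `Run`, and the `processFile` delegate are hypothetical names, and the delegate stands in for whatever per-file OCR call an application makes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of GdPicture.NET.
public static class BatchRunner
{
    // Runs `processFile` on every path using at most `maxWorkers` threads
    // and returns each file's result keyed by its path.
    public static IDictionary<string, string> Run(
        IEnumerable<string> paths, Func<string, string> processFile, int maxWorkers)
    {
        var results = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };
        Parallel.ForEach(paths, options, path =>
        {
            // ConcurrentDictionary makes this write safe across threads.
            results[path] = processFile(path);
        });
        return results;
    }
}
```

A caller might pass a delegate that creates its own OCR objects for each file. That is an assumption made here for safety, since this page does not say whether SDK instances can be shared across threads.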

Language coverage

Process documents in 100+ languages with 30+ built-in language packs and support for 100+ additional Tesseract languages. Full Unicode support ensures accurate recognition of multilingual content, including English, Chinese, Arabic, Russian, Japanese, and European languages. Recognize multiple languages in a single document for international workflows.

Demo

Test OCR accuracy with your documents

Upload scanned images, PDFs, or photos to evaluate our OCR engine’s performance.

Other OCR technologies

ADR — Automatic document recognition

Automatically classify and categorize documents using ML-based template matching. Identify document types — invoices, checks, forms, purchase orders, delivery notes — with confidence scoring. Create templates from sample documents and match new files against your library for intelligent document routing and workflow automation.

MICR — Magnetic ink character recognition

Extract MICR data from bank checks using E-13B and CMC-7 font recognition. Automatically decode routing numbers, account numbers, and check numbers from the magnetic ink line at the bottom of checks. Fast, accurate processing for financial document automation and payment workflows without manual region specification.

MRZ — Machine readable zone

Extract and decode machine readable zones from passports, visas, ID cards, and travel documents. Automatically recognize the standardized MRZ text at the bottom of identity documents and parse it into structured data, including names, document numbers, dates, and nationalities. Essential for border control and other identity verification workflows.
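
MRZ fields carry check digits defined by ICAO Doc 9303, which gives downstream code one way to validate parsed values. The sketch below computes that check digit in plain C#. It is illustrative only and not the SDK's API: `MrzCheckDigit` and `Compute` are hypothetical names.

```csharp
using System;

// Hypothetical helper for illustration; not part of GdPicture.NET.
public static class MrzCheckDigit
{
    // Computes the ICAO 9303 check digit for an MRZ field:
    // digits count as 0-9, letters A-Z as 10-35, the filler '<' as 0;
    // values are weighted 7, 3, 1 (repeating) and summed modulo 10.
    public static int Compute(string field)
    {
        int[] weights = { 7, 3, 1 };
        int sum = 0;
        for (int i = 0; i < field.Length; i++)
        {
            char c = field[i];
            int value;
            if (c >= '0' && c <= '9') value = c - '0';
            else if (c >= 'A' && c <= 'Z') value = c - 'A' + 10;
            else if (c == '<') value = 0;
            else throw new ArgumentException($"Invalid MRZ character '{c}'.");
            sum += value * weights[i % 3];
        }
        return sum % 10;
    }
}
```

For instance, `MrzCheckDigit.Compute("L898902C3")` returns 6, and `MrzCheckDigit.Compute("740812")` returns 2. A parsed field whose printed check digit differs from the computed one was probably misread.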

OMR — Optical mark recognition

Detect filled bubbles, checkboxes, and marks on forms for automated data capture. Process multiple-choice exams, questionnaires, surveys, and any document requiring mark detection. Returns binary results (filled/unfilled) with confidence scores. Anchoring technology compensates for document rotation and distortion.

MRC — Mixed raster content

Advanced image segmentation technology for hyper-compression and quality optimization. Automatically separates text, graphics, and images into distinct layers, applying optimal compression to each. Reduce PDF file sizes by up to 95 percent while maintaining or improving visual quality. Adaptive algorithms learn document structure for intelligent compression decisions.

ICR — Intelligent character recognition

Recognize handwritten text with AI-powered character recognition. Currently supports handwritten numerics in boxes, with expansion planned for additional contexts. Ideal for processing handwritten forms, applications, and documents where printed text isn’t available. Machine learning algorithms continuously improve accuracy across diverse handwriting styles.

KVP — Key-value pair extraction

Automatically extract structured data from unstructured and semi-structured documents using intelligent KVP extraction. Identify and extract critical information, such as invoice numbers, dates, totals, and addresses, without manual templates. Our AI engine recognizes document layouts and extracts relevant data for downstream validation and automation.

Get started

How to use

Download and install the GdPicture.NET package to access compiled demo applications and multi-language sample projects with full source code.

Explore demo apps

Find compiled demo applications in \Samples\Bin\.

Explore multi-language source code

Find C# and VB.NET demo apps and source code in \Samples\WinForm\.

Visit reference guide

Explore other code snippets within the online reference guide.

C#
// Assumes the GdPicture.NET 14 assembly is referenced.
using System;
using GdPicture14;

using GdPictureImaging gdpictureImaging = new GdPictureImaging();
// Select the image to process.
int imageID = gdpictureImaging.CreateGdPictureImageFromFile(@"C:\temp\source.png");
// Scan the barcodes.
gdpictureImaging.Barcode1DReaderDoScan(imageID);
// Determine the number of scanned barcodes.
int barcodeCount = gdpictureImaging.Barcode1DReaderGetBarcodeCount();
string content = "";
if (barcodeCount > 0)
{
    content = "Number of barcodes scanned: " + barcodeCount.ToString();
    // Save the value of each barcode.
    for (int i = 1; i <= barcodeCount; i++)
    {
        content += $"\nBarcode Number: {i} Value: {gdpictureImaging.Barcode1DReaderGetBarcodeValue(i)}";
    }
}
// Write the values to the console.
Console.WriteLine(content);
// Release unnecessary resources.
gdpictureImaging.Barcode1DReaderClear();
gdpictureImaging.ReleaseGdPictureImage(imageID);

VB.NET
' Assumes the GdPicture.NET 14 assembly is referenced.
Imports GdPicture14

Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
    ' Select the image to process.
    Dim imageID As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:\temp\source.png")
    ' Scan the barcodes.
    gdpictureImaging.Barcode1DReaderDoScan(imageID)
    ' Determine the number of scanned barcodes.
    Dim barcodeCount As Integer = gdpictureImaging.Barcode1DReaderGetBarcodeCount()
    Dim content = ""
    If barcodeCount > 0 Then
        content = "Number of barcodes scanned: " & barcodeCount.ToString()
        ' Save the value of each barcode.
        For i = 1 To barcodeCount
            content = content & vbLf & "Barcode Number: " & i.ToString() & " Value: " & gdpictureImaging.Barcode1DReaderGetBarcodeValue(i).ToString()
        Next
    End If
    ' Write the values to the console.
    Console.WriteLine(content)
    ' Release unnecessary resources.
    gdpictureImaging.Barcode1DReaderClear()
    gdpictureImaging.ReleaseGdPictureImage(imageID)
End Using

C#
// Assumes the GdPicture.NET 14 assembly is referenced.
using GdPicture14;

using GdPicturePDF gdpicturePDF = new GdPicturePDF();
using GdPictureImaging gdpictureImaging = new GdPictureImaging();
using GdPictureOCR gdpictureOCR = new GdPictureOCR();
// Select the image to process.
int imageID = gdpictureImaging.CreateGdPictureImageFromFile(@"C:\temp\source.png");
// Set the OCR parameters.
gdpictureOCR.SetImage(imageID);
gdpictureOCR.ResourceFolder = @"C:\GdPicture.NET 14\Redist\OCR";
gdpictureOCR.AddLanguage(OCRLanguage.English);
// Run the OCR process.
string resID = gdpictureOCR.RunOCR();
// Get the result of the OCR process as text.
string content = gdpictureOCR.GetOCRResultText(resID);
// Save the result in a PDF document.
gdpicturePDF.CreateFromText(PdfConformance.PDF, 595, 842, 10, 10, 10, 10,
    TextAlignment.TextAlignmentNear, content, 12, "Arial", false, false, true, false);
gdpicturePDF.SaveToFile(@"C:\temp\output.pdf");
// Release unnecessary resources.
gdpictureImaging.ReleaseGdPictureImage(imageID);

VB.NET
' Assumes the GdPicture.NET 14 assembly is referenced.
Imports GdPicture14

Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
Using gdpictureOCR As GdPictureOCR = New GdPictureOCR()
    ' Select the image to process.
    Dim imageID As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:\temp\source.png")
    ' Set the OCR parameters.
    gdpictureOCR.SetImage(imageID)
    gdpictureOCR.ResourceFolder = "C:\GdPicture.NET 14\Redist\OCR"
    gdpictureOCR.AddLanguage(OCRLanguage.English)
    ' Run the OCR process.
    Dim resID As String = gdpictureOCR.RunOCR()
    ' Get the result of the OCR process as text.
    Dim content As String = gdpictureOCR.GetOCRResultText(resID)
    ' Save the result in a PDF document.
    gdpicturePDF.CreateFromText(PdfConformance.PDF, 595, 842, 10, 10, 10, 10, TextAlignment.TextAlignmentNear, content, 12, "Arial", False, False, True, False)
    gdpicturePDF.SaveToFile("C:\temp\output.pdf")
    ' Release unnecessary resources.
    gdpictureImaging.ReleaseGdPictureImage(imageID)
End Using
End Using
End Using

C#
// Assumes the GdPicture.NET 14 assembly is referenced.
using System;
using GdPicture14;

using GdPictureImaging gdpictureImaging = new GdPictureImaging();
using GdPicturePDF gdpicturePDF = new GdPicturePDF();
// Store the handle of the active window in a variable.
IntPtr WINDOW_HANDLE = IntPtr.Zero;
// Select the scanner.
gdpictureImaging.TwainSelectSource(WINDOW_HANDLE);
gdpictureImaging.TwainOpenDefaultSource(WINDOW_HANDLE);
// (Optional) Hide the scanning user interface.
gdpictureImaging.TwainSetHideUI(true);
// Create the destination PDF document.
gdpicturePDF.NewPDF(PdfConformance.PDF);
// Get the image from the scanner.
int imageID = gdpictureImaging.TwainAcquireToGdPictureImage(WINDOW_HANDLE);
// Add the scanned image to a new page in the destination document.
gdpicturePDF.AddImageFromGdPictureImage(imageID, false, true);
// Run the OCR process.
gdpicturePDF.OcrPage("eng", @"C:\GdPicture.NET 14\Redist\OCR", "", 300);
// Save the result in a PDF document.
gdpicturePDF.SaveToFile(@"C:\temp\output.pdf");
// Release unnecessary resources.
gdpictureImaging.ReleaseGdPictureImage(imageID);
gdpictureImaging.TwainCloseSource();

VB.NET
' Assumes the GdPicture.NET 14 assembly is referenced.
Imports GdPicture14

Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
    ' Store the handle of the active window in a variable.
    Dim WINDOW_HANDLE = IntPtr.Zero
    ' Select the scanner.
    gdpictureImaging.TwainSelectSource(WINDOW_HANDLE)
    gdpictureImaging.TwainOpenDefaultSource(WINDOW_HANDLE)
    ' (Optional) Hide the scanning user interface.
    gdpictureImaging.TwainSetHideUI(True)
    ' Create the destination PDF document.
    gdpicturePDF.NewPDF(PdfConformance.PDF)
    ' Get the image from the scanner.
    Dim imageID As Integer = gdpictureImaging.TwainAcquireToGdPictureImage(WINDOW_HANDLE)
    ' Add the scanned image to a new page in the destination document.
    gdpicturePDF.AddImageFromGdPictureImage(imageID, False, True)
    ' Run the OCR process.
    gdpicturePDF.OcrPage("eng", "C:\GdPicture.NET 14\Redist\OCR", "", 300)
    ' Save the result in a PDF document.
    gdpicturePDF.SaveToFile("C:\temp\output.pdf")
    ' Release unnecessary resources.
    gdpictureImaging.ReleaseGdPictureImage(imageID)
    gdpictureImaging.TwainCloseSource()
End Using
End Using

C#
// Assumes the GdPicture.NET 14 assembly is referenced.
using GdPicture14;

using GdPicturePDF gdpicturePDF = new GdPicturePDF();
// Load the source document.
gdpicturePDF.LoadFromFile(@"C:\temp\source.pdf");
// Determine the number of pages.
int pageCount = gdpicturePDF.GetPageCount();
// Loop through the pages of the source document.
for (int i = 1; i <= pageCount; i++)
{
    // Select a page and run the OCR process on it.
    gdpicturePDF.SelectPage(i);
    gdpicturePDF.OcrPage("eng", @"C:\GdPicture.NET 14\Redist\OCR", "", 300);
}
// Save the result in a new PDF document.
gdpicturePDF.SaveToFile(@"C:\temp\output.pdf");
gdpicturePDF.CloseDocument();

VB.NET
' Assumes the GdPicture.NET 14 assembly is referenced.
Imports GdPicture14

Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
    ' Load the source document.
    gdpicturePDF.LoadFromFile("C:\temp\source.pdf")
    ' Determine the number of pages.
    Dim pageCount As Integer = gdpicturePDF.GetPageCount()
    ' Loop through the pages of the source document.
    For i = 1 To pageCount
        ' Select a page and run the OCR process on it.
        gdpicturePDF.SelectPage(i)
        gdpicturePDF.OcrPage("eng", "C:\GdPicture.NET 14\Redist\OCR", "", 300)
    Next
    ' Save the result in a new PDF document.
    gdpicturePDF.SaveToFile("C:\temp\output.pdf")
    gdpicturePDF.CloseDocument()
End Using

C#
// Assumes the GdPicture.NET 14 assembly is referenced.
using GdPicture14;

using GdPicturePDF gdpicturePDF = new GdPicturePDF();
using GdPictureImaging gdpictureImaging = new GdPictureImaging();
using GdPictureOCR gdpictureOCR = new GdPictureOCR();
// Select the image to process.
int imageID = gdpictureImaging.CreateGdPictureImageFromFile(@"C:\temp\source.png");
// Set the OCR parameters.
gdpictureOCR.SetImage(imageID);
gdpictureOCR.ResourceFolder = @"C:\GdPicture.NET 14\Redist\OCR";
gdpictureOCR.AddLanguage(OCRLanguage.English);
// Run the OCR process.
string resID = gdpictureOCR.RunOCR();
// Get the result of the OCR process as text.
string content = gdpictureOCR.GetOCRResultText(resID);
// Save the result in a PDF/A-4f document.
gdpicturePDF.CreateFromText(PdfConformance.PDF_A_4f, 595, 842, 10, 10, 10, 10,
    TextAlignment.TextAlignmentNear, content, 12, "Arial", false, false, true, false);
gdpicturePDF.SaveToFile(@"C:\temp\output.pdf");
// Release unnecessary resources.
gdpictureImaging.ReleaseGdPictureImage(imageID);

VB.NET
' Assumes the GdPicture.NET 14 assembly is referenced.
Imports GdPicture14

Using gdpicturePDF As GdPicturePDF = New GdPicturePDF()
Using gdpictureImaging As GdPictureImaging = New GdPictureImaging()
Using gdpictureOCR As GdPictureOCR = New GdPictureOCR()
    ' Select the image to process.
    Dim imageID As Integer = gdpictureImaging.CreateGdPictureImageFromFile("C:\temp\source.png")
    ' Set the OCR parameters.
    gdpictureOCR.SetImage(imageID)
    gdpictureOCR.ResourceFolder = "C:\GdPicture.NET 14\Redist\OCR"
    gdpictureOCR.AddLanguage(OCRLanguage.English)
    ' Run the OCR process.
    Dim resID As String = gdpictureOCR.RunOCR()
    ' Get the result of the OCR process as text.
    Dim content As String = gdpictureOCR.GetOCRResultText(resID)
    ' Save the result in a PDF/A-4f document.
    gdpicturePDF.CreateFromText(PdfConformance.PDF_A_4f, 595, 842, 10, 10, 10, 10,
        TextAlignment.TextAlignmentNear, content, 12, "Arial", False, False, True, False)
    gdpicturePDF.SaveToFile("C:\temp\output.pdf")
    ' Release unnecessary resources.
    gdpictureImaging.ReleaseGdPictureImage(imageID)
End Using
End Using
End Using

Trusted by 3,000+ customers and Fortune 500 companies

More than 15 years of experience developing our SDK

Trusted by more than 10,000 developers

Frequently asked questions

What is the GdPicture.NET OCR SDK?

The GdPicture.NET OCR SDK is a comprehensive document imaging toolkit that provides powerful optical character recognition (OCR) capabilities for .NET applications. It enables developers to extract text from scanned documents, images, and PDFs with high accuracy using AI and machine learning technology.

Which languages does the GdPicture.NET OCR SDK support?

The SDK supports recognition of more than 130 languages, including complex scripts and right-to-left languages like Arabic and Hebrew. It comes with 30+ built-in language packs and supports 100+ additional Tesseract language packs for extended coverage.

How does the GdPicture.NET OCR SDK ensure high accuracy in text recognition?

The SDK employs advanced preprocessing and segmentation techniques to enhance OCR accuracy. Built-in image correction features like deskewing, denoising, and character enhancement automatically improve source quality before recognition. The AI-powered engine delivers more than 99 percent accuracy across diverse document types.

Can the GdPicture.NET OCR SDK convert documents into searchable PDFs?

Yes. The SDK enables the creation of searchable PDFs by embedding recognized text within the PDF as an invisible layer. This makes the content fully searchable and selectable while preserving the original document appearance. It also supports PDF/A archival compliance for long-term document preservation.

Is the GdPicture.NET OCR SDK compatible with multithreaded applications?

Yes. The SDK includes full multithreading support for high-performance document processing. You can configure CPU limits to optimize performance across diverse workloads — from single documents to high-volume batch processing in enterprise environments.

60-day free trial

Try GdPicture.NET now!
