14.2: Reinventing GdPicture.NET with our Most Important Release Yet
GdPicture.NET 14.2 introduces cross-platform support and a comprehensive set of intelligent document processing tools, making it our largest release yet. Let us show you the new features. Welcome to the new GdPicture.NET!
- GdPicture.NET and DocuVieware are now cross-platform
- New Intelligent Document Processing set of tools
- And we have more!
- Simplified licensing management
We are thrilled to introduce the new GdPicture.NET and DocuVieware versions that include Linux support, PostScript file format support, and a large set of AI capabilities.
We’ve been working on this release for a year, and thanks to the additional resources provided by joining the PSPDFKit team, we were able to fuel the development of your favorite and unrivaled Intelligent Document Processing SDK.
GdPicture 14.2 marks the beginning of a long series of versions dedicated to delivering very advanced document capabilities to any kind of application.
GdPicture.NET and DocuVieware are now cross-platform
You will find new .NET 6 assemblies for GdPicture.NET & DocuVieware with Linux x64 & arm64 compatibility. More platforms will be supported very soon.
Cross-platform support is available in the GdPicture.NET.14.API assembly.
New Intelligent Document Processing set of tools
Intelligent Document Processing (IDP) is a set of machine vision and AI technologies that helps access and process the unstructured and semi-structured data of electronic documents so the information can be reused easily.
All industries can benefit from integrating an IDP solution in their systems, as it helps mitigate errors and saves a lot of time.
90% of business documents are unstructured with data not readily available for reuse.
The GdPicture.NET IDP tools, just like our OCR engine, are designed to be generic.
Focusing on a generic approach means that the engine is built to give the best results on all documents, and not just in specific contexts. This also means that the engine can batch-process thousands of documents as is, without setting up specific parameters.
We favor this approach to help our users with business documents, such as invoices, quotes, receipts, payment slips, etc. These documents are heterogeneous and found in various structured, unstructured, and semi-structured formats: native PDFs, image PDFs, office formats (text, spreadsheet, presentation), raster and vector images, emails, HTML, and more.
The new Intelligent Document Processing category includes Key-Value Pair extraction, Smart Redaction, and Table Extraction.
What are the Intelligent Document Processing technologies?
The GdPicture.NET IDP tools rely on a combination of heuristics, mathematics, and Artificial Intelligence capabilities, including:
- Document Layout Analysis (DLA)
- Optical Character Recognition (OCR)
- Key-Value Pair extraction (KVP)
- Natural Language Processing (NLP)
- Named-Entity Recognition (NER)
If you want to know more about these technologies, you can find information on the Intelligent Document Processing page.
None of these technologies works in isolation
Key-Value Pair extraction
Most of you already know our Key-Value Pair extraction engine, as it is the first IDP tool we released before GdPicture.NET 14.2.
A key-value pair consists of two related data items: a key and a value. The key is fixed and identifies the data; the value is variable and holds the data the key refers to.
The GdPicture.NET KVP extraction engine also provides two additional fields besides Key and Value: Type and Accuracy. Type provides the nature of the content (name, phone number, credit card number, etc.). Accuracy is a confidence level computed by considering various parameters (OCR results at character and word levels, type of key, position on the page, and more).
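As an illustration only, the four fields described above can be modeled as a small record. This is a minimal sketch, not the GdPicture.NET API: the `KeyValuePair` class, the `value_type` field name, the `confident_pairs` helper, the sample data, and the 0 to 100 accuracy scale are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KeyValuePair:
    """Hypothetical record mirroring the Key, Value, Type, and Accuracy fields."""
    key: str          # fixed label, e.g. "Invoice number"
    value: str        # variable content found for that label
    value_type: str   # nature of the content ("Type" in the text above)
    accuracy: float   # confidence level; a 0-100 scale is assumed here


def confident_pairs(pairs: List[KeyValuePair], min_accuracy: float) -> List[KeyValuePair]:
    """Keep only the pairs whose accuracy reaches the given threshold."""
    return [p for p in pairs if p.accuracy >= min_accuracy]


# Made-up sample data for illustration.
pairs = [
    KeyValuePair("Invoice number", "INV-0042", "identifier", 97.5),
    KeyValuePair("Phone", "555 0100", "phone number", 62.0),
]

for pair in confident_pairs(pairs, min_accuracy=80.0):
    print(f"{pair.key}: {pair.value} ({pair.value_type}, {pair.accuracy}%)")
```

Filtering on the accuracy field is one way a batch process could route low-confidence results to manual review.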
KVP extraction is an essential tool, as it is one of the underlying technologies for Smart Redaction and Table Extraction.
Smart Redaction
The GdPicture.NET SDK now includes two sets of redaction tools: PDF Redaction and Smart Redaction.
What distinguishes Smart Redaction from PDF Redaction is that it does the heavy work for you, automatically.
Where PDF Redaction is either manual or semi-automated with the help of regular expressions, Smart Redaction retrieves and permanently removes personal information from documents automatically.
Thanks to the various Intelligent Document Processing technologies, the Smart Redaction engine is particularly efficient on poor-quality and scanned documents. It also retrieves data trapped in tables, graphics, and images.
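To make the contrast concrete, the regular-expression style of redaction mentioned above for PDF Redaction can be sketched on plain text. This is a simplified illustration, not the GdPicture.NET API: the `PATTERNS` table, the `redact` function, and the two patterns are assumptions. It also only masks characters in a string, whereas real PDF redaction must remove the underlying content from the file.

```python
import re

# Hypothetical patterns for two kinds of personal information.
# Real-world patterns would need to be far more thorough.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str, mask: str = "[REDACTED]") -> str:
    """Replace every match of every pattern with the mask."""
    for pattern in PATTERNS.values():
        text = pattern.sub(mask, text)
    return text


print(redact("Email jane@example.com, phone 555-010-9999"))
```

A pattern-based approach only finds what its patterns anticipate, which is the gap that an automatic approach such as Smart Redaction is meant to close.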
Table Extraction
Table and cell recognition
The challenges with tables are too many to list, as it’s rare to find a clean, clearly defined table in a business document.
Tables can be split across several pages; sometimes they don’t look like tables at all; and more often than not, cells have a background color that makes their content difficult to read. And then there are poorly scanned documents with skewed tables.
The GdPicture.NET Table Extraction engine overcomes all these challenges.
Pure AI engines usually do well with extracting content from scanned documents. However, they use up a lot of resources and tend to be very slow, even on powerful machines.
Unlike solutions that rely on AI technology only, our engine is fully adaptive, meaning that it automatically uses the best approach (heuristics, mathematics, or deep learning, depending on the context) to extract data.
And we have more!
PostScript support (BETA)
PostScript files (PS – Adobe PostScript format) can now be opened for viewing and PDF conversion purposes.
This new feature is in a preview stage and will be quickly improved over time, particularly with the addition of EPS – Encapsulated PostScript.
Fuzzy search
The search feature has been enhanced with fuzzy search.
This means that instead of matching only the exact words you’re looking for in a document, the engine can also look for semantically close results. Fuzzy search is particularly useful when looking for personal information that is not precisely defined or known beforehand.
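Fuzzy matching is commonly implemented with an edit distance, which measures how many single-character changes separate two strings. The sketch below shows that general technique. It is an assumption for illustration, not a description of how the GdPicture.NET search engine works, and `edit_distance` and `fuzzy_find` are hypothetical names.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_find(words: List[str], query: str, max_distance: int = 1) -> List[str]:
    """Return the words within max_distance edits of the query, ignoring case."""
    target = query.lower()
    return [w for w in words if edit_distance(w.lower(), target) <= max_distance]


print(fuzzy_find(["Jonson", "Johnson", "Jackson"], "Johnson"))
```

With a tolerance of one edit, a misspelled or badly recognized name such as "Jonson" is still found when searching for "Johnson".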
Changelog
This new version also comes with several improvements to the PDF, OCR, and Formats SDKs.
As usual, you will find the full changelogs for GdPicture.NET and DocuVieware on their respective Version history pages.
Simplified licensing management
This new version comes with a new simplified licensing mechanism.
A new license key is required to unlock it, which can be immediately obtained by contacting our Customer Success Team.
We hope you will like this new version as much as we do!
Cheers!
Elodie