The engine detects labeled information in a document, extracts each label together with its value, and qualifies the value, all in an instant.
A key-value pair (KVP) consists of two related data items: a key and a value. The key identifies the data and is fixed; the value is variable and holds the content that the key names.
Date (key) : 06/14/22 (value)
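To make the key/value distinction concrete, here is a minimal Python sketch that splits a labeled line such as the one above into its two parts. It is a naive illustration only, not how the GdPicture.NET engine works: `KeyValuePair` and `parse_labelled_line` are hypothetical names, and the sketch assumes the key and value are separated by the first colon on the line.

```python
from typing import NamedTuple, Optional


class KeyValuePair(NamedTuple):
    """Hypothetical container: a fixed key and its variable value."""
    key: str
    value: str


def parse_labelled_line(line: str) -> Optional[KeyValuePair]:
    """Split 'Key : value' on the first colon; return None if no pair is found."""
    key, sep, value = line.partition(":")
    key, value = key.strip(), value.strip()
    if not sep or not key or not value:
        return None
    return KeyValuePair(key, value)


pair = parse_labelled_line("Date : 06/14/22")
print(pair)  # KeyValuePair(key='Date', value='06/14/22')
```

Splitting on the first colon only keeps values that themselves contain colons (a time such as 10:30, for example) intact.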
Key-value pair fields differ by document type. For instance, the fields on an invoice differ from those on a survey or a government form.
It’s easy to get key-value pairs from structured documents like Excel files because every value is already labeled.
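The structured case can be sketched in a few lines of Python. This example uses CSV text as a stand-in for a spreadsheet export and relies only on the standard `csv` module; the column names and the sample row are made up for illustration, and `rows_as_pairs` is a hypothetical helper.

```python
import csv
import io

# Made-up sample standing in for a spreadsheet export: the header row names every value.
SAMPLE = "Invoice Number,Date,Total\nINV-001,06/14/22,125.50\n"


def rows_as_pairs(text: str) -> list:
    """Return one dict per data row, mapping each column header (key) to its cell (value)."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


records = rows_as_pairs(SAMPLE)
for key, value in records[0].items():
    print(f"{key} (key) : {value} (value)")
```

No layout analysis is needed here because the header row already supplies the keys.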
Key-value pair fields for invoices can be:
Any document that lacks a pre-defined data model or is not organized in a pre-defined manner contains unstructured data. Such documents represent about 90% of all documents generated.
For these documents, you will need a KVP extraction engine to retrieve the information.
The KVP extraction engine is an integral part of the GdPicture.NET OCR engine. Like the other OCR technologies (MICR, MRZ, OMR, contextual OCR, and more), it benefits from a hybrid approach that combines heuristics, mathematics, and machine learning (ML).
We also use adaptive layout understanding and the same underlying techniques as NLP technologies.
The GdPicture.NET engine automatically adapts to the document and searches for the right approach, making the best use of resources available.
This approach yields excellent results where traditional OCR and pure machine-learning engines are usually weak, especially with:
The GdPicture.NET KVP extraction engine also provides two additional fields besides Key and Value: Type and Accuracy.
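As an illustration of how the two extra fields might be used downstream, the following Python sketch models a result with four fields and separates confident results from those needing review. Everything here is an assumption for illustration: `ExtractedPair`, `split_by_accuracy`, the representation of Type as a text label, the 0 to 100 scale for Accuracy, and the sample values are all hypothetical and are not taken from the GdPicture.NET API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedPair:
    """Hypothetical record mirroring the four fields: Key, Value, Type, Accuracy."""
    key: str
    value: str
    value_type: str   # assumed: a text label describing the kind of value
    accuracy: float   # assumed: a confidence score on a 0-100 scale


def split_by_accuracy(pairs, threshold=80.0):
    """Separate pairs at or above the threshold from those that need manual review."""
    accepted = [p for p in pairs if p.accuracy >= threshold]
    review = [p for p in pairs if p.accuracy < threshold]
    return accepted, review


# Made-up sample results, for illustration only.
pairs = [
    ExtractedPair("Date", "06/14/22", "date", 97.0),
    ExtractedPair("Total", "125.50", "amount", 62.5),
]
accepted, review = split_by_accuracy(pairs, threshold=80.0)
print([p.key for p in accepted], [p.key for p in review])
```

A threshold of this kind lets high-confidence values flow straight into a workflow while routing the rest to a person; the right cutoff depends on the documents and on how costly an error is.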
You will find a compiled demo application in
You will find C# and VB.NET demo applications including source code in
You will find other code snippets in the online reference guide: GdPicture.NET Guides