The GdPicture.NET Intelligent Document Processing (IDP) tools rely on a combination of technologies, including heuristics, mathematics, and artificial intelligence, while making the best use of available resources.
Document Layout Analysis is the identification and categorization of regions on a document.
It involves a geometric analysis (tables, pictures, equations, and barcodes) and a logical layout analysis (paragraphs, lines, words, and characters) of the document.
For Intelligent Document Processing purposes, traditional OCR is not enough, especially for anything other than typed text on a perfectly white background. So, for documents with:
Traditional OCR won’t work well.
This also means that solutions built on traditional OCR are hard to scale because they require a lot of manual verification.
The GdPicture.NET IDP tools use their own OCR engine combined with AI technologies, such as machine learning and deep learning, to mitigate the limitations of traditional OCR.
A key-value pair (KVP) consists of two related data items: a key and a value. The key is fixed and defines the data, while the value is variable and holds the content associated with that key.
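To make the idea concrete, here is a minimal Python sketch of key-value pairs pulled from plain text. It is not how the GdPicture.NET engine works: it assumes, purely for illustration, that each pair sits on one line in a `Label: value` layout, and the names `KeyValuePair`, `LINE_PATTERN`, and `extract_pairs` are hypothetical.

```python
import re
from typing import List, NamedTuple


class KeyValuePair(NamedTuple):
    key: str    # fixed label, e.g. "Invoice Number"
    value: str  # variable content found next to the label


# Assumption for this sketch: a label, a colon, then the value on the same line.
LINE_PATTERN = re.compile(r"^\s*([^:\n]+?)\s*:\s*(.+?)\s*$")


def extract_pairs(text: str) -> List[KeyValuePair]:
    """Return one KeyValuePair per line that matches the 'Label: value' layout."""
    pairs = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            pairs.append(KeyValuePair(match.group(1), match.group(2)))
    return pairs


sample = "Invoice Number: INV-0042\nDate: 2024-01-15\nThank you for your business"
pairs = extract_pairs(sample)
```

Lines without a colon, such as the closing sentence in `sample`, are skipped. A layout-based rule like this breaks down when labels and values are arranged freely on a page, which is the situation the AI-based techniques described here are meant to handle.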
Natural Language Processing (NLP) is an AI technology that enables machines to understand human language in text or voice form and to communicate with humans in their own natural language.
NLP is essential for extracting data from unstructured documents: together with deep learning, it is the technology that makes sense of the extracted information.
Named Entity Recognition (NER) is a form of Natural Language Processing (NLP), a subfield of artificial intelligence.
It is a sub-task of information extraction that tries to locate and classify named entities in unstructured text into predefined categories such as a person's name, ID number, address, organization, etc. This technology is used for key-value pair extraction and smart redaction in unstructured/semi-structured documents.
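The locate-and-classify step, and its use for redaction, can be sketched in a few lines of Python. This is a rule-based stand-in, not the GdPicture.NET implementation: real NER systems generally rely on trained models rather than fixed patterns. The two categories, their regular expressions (including the `ID-` followed by six digits format), and the names `Entity`, `find_entities`, and `redact` are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    label: str  # predefined category, e.g. "EMAIL"
    start: int  # offset of the first character in the text
    end: int    # offset one past the last character
    text: str   # the matched string


# Hypothetical categories and patterns, chosen only for this sketch.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ID_NUMBER": re.compile(r"\bID-\d{6}\b"),
}


def find_entities(text: str) -> List[Entity]:
    """Locate every pattern match and tag it with its category."""
    entities = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            entities.append(Entity(label, m.start(), m.end(), m.group()))
    return sorted(entities, key=lambda e: e.start)


def redact(text: str) -> str:
    """Replace each entity with its category label, e.g. '[EMAIL]'."""
    # Work from the end of the text so earlier offsets stay valid.
    for e in reversed(find_entities(text)):
        text = text[:e.start] + f"[{e.label}]" + text[e.end:]
    return text


sample = "Contact jane.doe@example.com about badge ID-204817."
entities = find_entities(sample)
redacted = redact(sample)
```

Here `redacted` becomes `Contact [EMAIL] about badge [ID_NUMBER].`. Categories such as personal names, addresses, or organizations have no fixed format, so they cannot be captured by patterns like these and call for a statistical or deep learning model.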
The GdPicture.NET KVP extractor detects labeled information in a document, extracts each label together with its value, and qualifies the value, all in an instant.
Secure, automate, and accelerate the process of removing personal and sensitive information from your electronic documents with the AI-powered GdPicture.NET Smart Redaction engine.
Extract all data trapped in tables automatically, even on scanned and low-quality documents, with the innovative GdPicture.NET Table Extraction engine.