The first article of our PDF Optimization In-depth Series is available here. The PDF format is interactive. During the release cycle of a PDF document, different people will use tools such as forms, annotations, attachments, and more. Archived PDF documents generally do not have the same use as those in circulation in a collaborative context. […]
Don’t get your fingers burned with uncovered sensitive data. Have you ever heard of PDF redaction? If not, either you don’t have sensitive content to hide, or you are doing it the wrong way. If yes, ask yourself: am I redacting my files properly? In this article, you will learn how to redact PDF […]
Here are the links for the two previous articles of the series: A PDF document is composed of various data structures, described as different objects. In the scope of optimization, one is particularly important: the stream object. This object aims to store binary data representing the content of an image, font file, color profile, embedded […]
Machine learning for text recognition is transforming OCR (optical character recognition) by using advanced algorithms to improve accuracy and speed. In the previous articles of our series, we covered some of the deep learning techniques used in text detection, which is the first part of any OCR system. In this article, we will cover the second […]
In the fast-paced world of digital data capture, businesses and developers rely on efficient barcode scanning solutions to streamline operations. Among the most versatile barcode types is the 2D Data Matrix barcode, widely used in manufacturing, healthcare, retail, and logistics. If you’re looking for a robust, accurate, and high-performance barcode scanning solution for .NET applications, […]
Sometimes, text in a scanned PDF isn’t selectable or searchable. This is where Optical Character Recognition (OCR) comes in. By using OCR, you can extract text from PDF files and save it in a file for editing or further processing. In this guide, you’ll learn how to use GdPicture.NET’s OCR engine to recognize text from […]
This paper was originally a presentation in French delivered at the PDF Day – France by Loïc Carrère, CEO of ORPALIS, in April 2019. Organized by the PDF Association, the PDF days are the meeting place of the PDF industry, where experts conduct educational (non-commercial) presentations, panel, and discussion-based sessions about the format. The richness […]
To export a table from PDF to Excel, you need an efficient tool that ensures accurate data extraction while preserving the structure. This process is essential when dealing with complex tables in PDFs that need to be converted into editable, usable formats like Excel. Various methods, including software solutions and online tools, can help streamline […]
In this third article of the Table Extraction series, we’ll see how the GdPicture.NET engine goes beyond OCR and Deep Learning methods thanks to the Layout Understanding approach. You can read the first articles of the Table Extraction series here: Part 1: Challenges; Part 2: Deep Learning Approaches. The Layout Understanding approach. A performant table extraction […]
Extracting tables from PDFs and electronic documents can be easy or very complex, depending on the nature of the file. In this series, we’ll see how automatic table extraction can help companies overcome various challenges. We will also compare the different approaches available on the market (OCR, Deep Learning, and Layout Analysis) and tell […]