How to redact sensitive PDF content…efficiently.



Don’t get your fingers burned by exposed sensitive data

Have you ever heard of PDF Redaction? If not, either you have no sensitive content to hide or you are doing it the wrong way. If yes, ask yourself: am I redacting my files properly? In this article, you will learn how to redact PDF files efficiently thanks to the GdPicture.NET SDK!

Indeed, hiding sensitive content is anything but trivial. Back when everything was on paper, a strong coat of black ink was enough to hide your dirtiest secrets. Nowadays, in the digital age, you need stronger methods. If you thought PDF Redaction consisted of overlaying black rectangles, you have it all wrong.

Figure: a flawed PDF redaction. Overlaying black rectangles does not remove the content underneath, which third-party software can easily expose.
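
To see why an overlay is not enough, here is a minimal sketch in plain C# that does not use GdPicture. It assumes a hand-written, uncompressed page content stream: the sample text, the coordinates and the `OverlayDemo` helper are all hypothetical, and the extraction is deliberately naive. The stream shows a line of text and then paints a filled black rectangle over it, yet the text operand is still sitting there for anyone who reads the stream.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlayDemo
{
    // Hypothetical, simplified page content stream (illustrative data only).
    // It first shows a line of text, then paints a filled black rectangle over it.
    public const string PageContent =
        "BT /F1 12 Tf 420 622 Td (Agent: John Doe) Tj ET\n" + // show text
        "0 0 0 rg\n" +                                         // fill color: black
        "420 618 90 15 re f\n";                                // rectangle, then fill

    // Collects the string operand of every "Tj" (show text) operator.
    // Deliberately naive: it ignores escaped parentheses, hex strings,
    // TJ arrays and compressed streams, which real files commonly use.
    public static List<string> ExtractShownText(string contentStream)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(contentStream, @"\(([^)]*)\)\s*Tj"))
        {
            found.Add(m.Groups[1].Value);
        }
        return found;
    }
}
```

Calling `OverlayDemo.ExtractShownText(OverlayDemo.PageContent)` returns a single entry, `Agent: John Doe`. The rectangle only changes what a viewer paints on screen, not what the file contains.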

Ask “The New York Times”! In 2014, they published an NSA document and thought they had taken every measure needed to hide its sensitive content. However, the redaction was so weak that readers could lift the masks and reveal the names of operating agents. Embarrassing!

Speaking of the NSA, these are people you can trust on how to properly “redact” sensitive content:

“the way to avoid exposure is to ensure that sensitive information is not just visually hidden or made illegible, but is actually removed from the original document.”

U.S. National Security Agency (NSA)

 Source: http://www.ca7.uscourts.gov/guides/nsa-redact.pdf

Opt for efficient PDF redaction

The good news is that GdPicture addresses this objective perfectly. PDF Redaction in our SDK not only hides sensitive content but also removes the underlying content entirely. Even if some nasty hacker wanted to harm your company by revealing hidden content, they would find nothing but emptiness.

Figure: a sound PDF redaction. GdPicture actually removes the “hidden” content from the file, so removing the black rectangles reveals nothing. This guarantees a high level of confidentiality.

The question is: how does it work? Well, it’s pretty straightforward. Follow these three steps:

  1. Select the page using SelectPage
  2. Define area using AddRedactionRegion
  3. Hide contents using ApplyRedaction

Remember, hiding means truly deleting content, not placing an easily removable layer on top. Once you hide something with GdPicture, you actually remove the bytes from the stream. Hasta la vista, gone…

Note that you can define multiple areas before calling the ApplyRedaction method. See the following snippet for a better understanding:

//We assume that GdPicture has been correctly installed and unlocked.
using (GdPicturePDF oGdPicturePDF = new GdPicturePDF())
{
    //Loading of the PDF document you want to redact.
    oGdPicturePDF.LoadFromFile("file_to_redact.pdf", false);

    //Set coordinate origin location
    oGdPicturePDF.SetOrigin(PdfOrigin.PdfOriginTopLeft);

    //Select page
    oGdPicturePDF.SelectPage(1);

    //Add redaction region on first page covering sensitive information
    oGdPicturePDF.AddRedactionRegion(420, 170, 90, 15);

    //Select different page
    oGdPicturePDF.SelectPage(2);

    //Add another redaction region on different page
    oGdPicturePDF.AddRedactionRegion(420, 170, 90, 15);

    //After all desired redaction regions have been added apply the redaction
    oGdPicturePDF.ApplyRedaction();

    //Save the redacted document
    oGdPicturePDF.SaveToFile("output.pdf");
}

If you want to reset your area definitions, you can call the ClearRedactionRegions method. Keep in mind, though, that it won’t restore the original document content if you have already called ApplyRedaction. Once deleted, gone forever. And that’s how you redact a PDF file efficiently!
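
As a hedged sketch of that reset behavior, the snippet below drops a mistaken region with ClearRedactionRegions before anything has been applied. It relies only on the methods named in this article. The `GdPicture14` namespace, the `RedactionResetSketch` wrapper, the file names and the coordinates are assumptions for illustration, so adjust them to your installed version and your own documents.

```csharp
using GdPicture14; // assumption: namespace of the installed GdPicture.NET version

public static class RedactionResetSketch
{
    public static void Run()
    {
        // We assume that GdPicture has been correctly installed and unlocked.
        using (GdPicturePDF pdf = new GdPicturePDF())
        {
            pdf.LoadFromFile("file_to_redact.pdf", false);
            pdf.SetOrigin(PdfOrigin.PdfOriginTopLeft);
            pdf.SelectPage(1);

            // A region defined by mistake (hypothetical coordinates).
            pdf.AddRedactionRegion(10, 10, 50, 15);

            // Nothing has been applied yet, so the pending region can still be dropped.
            pdf.ClearRedactionRegions();

            // Define the intended region, then apply. From here on the removal is permanent.
            pdf.AddRedactionRegion(420, 170, 90, 15);
            pdf.ApplyRedaction();

            // Calling ClearRedactionRegions at this point would not bring the content back.
            pdf.SaveToFile("output.pdf");
        }
    }
}
```

If you need the original content later, keep an untouched copy of the source file and write the redacted version to a different path, as done here with output.pdf.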

When PDF redaction helps you become GDPR compliant

Let’s be honest, many companies still have no clue what GDPR means exactly or how it affects them. Let’s try to explain it in simple terms. This regulation intends to protect European citizens from the misuse of their personal data.

Figure: a computer displaying a lock inside the European flag, a metaphor for the GDPR.

One of its key principles, known as the “right to be forgotten”, allows anyone to ask a company holding their data to simply delete it. Does this make sense now?

Yes, but what does that have to do with PDF Redaction? Well, as your PDF files may contain personal data (think of invoices, statements…), you may be required to track how this personal data is used and to assure the customer (and the regulator) that you are able to remove the files if requested. A few companies have already been heavily fined for failing to meet this obligation.

Using PDF Redaction is a convenient way to stick to the rules. By deleting sensitive content, you take a concrete step toward compliance with the regulation: your documents no longer expose personal data.
See? Problem gone!

Why GdPicture is a wise choice

PDF Redaction is a serious topic, and you have to be really careful with the technology you are using. You may find many apps offering redaction features, but will they really protect you from disclosure? GdPicture.NET does.

Want to see PDF Redaction in action in our SDK? Why don’t you try the demo?

Download GdPicture.NET now for a 60-day free trial!

Cheers,

Loïc and Elodie

