Protect Confidential Content

Sometimes you need to check documents that are extremely confidential. You don't want their text stored in a database where you have limited control over who can see it. Confidential checking solves this problem. When you enable confidential checking, text from the document is no longer stored in the Analytics database. The only person who has access to the full details of a check is the person who originally checked the document.

Confidential Checking Is Different From Anonymous Checking

Acrolinx has a privacy-related feature that is sometimes referred to as anonymous checking. It's easy to get these features confused, so it's important to remember the following differences.

  • Confidential checking is designed to protect sensitive content.
    You can't tell what kind of content was checked, but you can still see who ran the check.

  • Anonymous checking is designed to protect sensitive user data.
    You can't tell who ran the check, but you can still see what kind of content they checked.

Enable Confidential Checking

Confidential checking is enabled in Content Profiles, so make sure that your Content Profiles are set up to detect confidential documents. For more information about how to set up Content Profiles correctly, see the "Content Profiles" section.

To enable confidential checking, follow these steps:

  1. Create a Content Profile or open an existing Content Profile.

  2. Select the option "Check all content confidentially".
    Your changes take effect immediately.

What Happens After Confidential Checking Is Enabled

The new checking behavior is easy to observe if you use the Content Analyzer.

Suppose that you check a directory that contains confidential documents only. You won't see any checking statistics when you open the Content Analysis Dashboard. This is because statistics about confidential documents aren't written to the Analytics database.

If you check a directory that contains a mixture of confidential and non-confidential documents, the Content Analysis Dashboard shows statistics only for the non-confidential documents.

Generally, this means that confidential documents don't affect your company quality statistics at all. For example, suppose that you check a high volume of confidential content that contains too many long sentences. If you've been tracking the average number of long sentences in your content, you won't see any fluctuation in the average. Even aggregated quality statistics aren't stored for confidential documents.
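
The effect on an average can be illustrated with a small sketch. This is only a model of the behavior described above, not how the Core Platform is implemented: the `CheckResult` type, its fields, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Hypothetical record of one check (not an Acrolinx data structure)."""
    path: str
    long_sentences: int
    confidential: bool


def recorded_results(results):
    """Keep only the results that would contribute to aggregated statistics."""
    return [r for r in results if not r.confidential]


def average_long_sentences(results):
    """Average over non-confidential results; None if nothing was recorded."""
    recorded = recorded_results(results)
    if not recorded:
        return None
    return sum(r.long_sentences for r in recorded) / len(recorded)


# Made-up sample data: two ordinary documents, then two confidential ones
# with many long sentences.
before = [
    CheckResult("docs/a.xml", 2, False),
    CheckResult("docs/b.xml", 4, False),
]
after = before + [
    CheckResult("legal/c.xml", 40, True),
    CheckResult("legal/d.xml", 35, True),
]

print(average_long_sentences(before))  # 3.0
print(average_long_sentences(after))   # 3.0, unchanged by the confidential checks
```

In this model, a directory that contains only confidential documents yields `None`, which corresponds to seeing no statistics at all.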

Exceptions

Some details about confidential content are still stored on the Core Platform. Here are the exceptions:

  • All files in the server/www/output directory
    Files that are written here still contain full details about confidential content. These files include Scorecards and any other supplementary files such as extracted text files.

    However, these files are easy to identify because they contain "_confidential_" in the file name. For example, the file name for a confidential Scorecard looks like this:

    udwreomhmjoup7byzvwv2gdl6h_confidential_report.xml

    Once you know this convention, your server administrator can create an automated process that deletes these files on a regular basis. How often you want to delete these files is up to you.

  • The document path and file name
    These two data points are still written to the Analytics database. We store general information about each checked document including any custom field that you might have configured such as "project" or "team".

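
As a rough illustration of the automated cleanup mentioned for the output directory, the following sketch finds files whose names contain "_confidential_" and optionally deletes them. It is a minimal example under stated assumptions, not an official Acrolinx tool: the function names, the age threshold, and the relative path in the usage section are hypothetical, so adjust them to your installation and test with the dry run first.

```python
import time
from pathlib import Path

# Marker that confidential output files carry in their file names.
MARKER = "_confidential_"


def find_confidential_files(output_dir, min_age_seconds=0.0, now=None):
    """Return files under output_dir whose name contains MARKER and whose
    last modification is at least min_age_seconds in the past."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    now = time.time() if now is None else now
    matches = []
    for path in root.rglob("*"):
        if path.is_file() and MARKER in path.name:
            if now - path.stat().st_mtime >= min_age_seconds:
                matches.append(path)
    return sorted(matches)


def delete_confidential_files(output_dir, min_age_seconds=0.0, dry_run=True):
    """Delete matching files. With dry_run=True, only report what would go."""
    targets = find_confidential_files(output_dir, min_age_seconds)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets


if __name__ == "__main__":
    # Hypothetical location and retention period (one day); dry run only.
    for path in delete_confidential_files(
        "server/www/output", min_age_seconds=24 * 3600, dry_run=True
    ):
        print("would delete:", path)
```

A server administrator could run a script like this from a scheduler such as cron or Windows Task Scheduler. The age threshold is there so that Scorecards from very recent checks aren't removed while someone might still be reading them.
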
Server Workload Captures

A server workload capture bundles many files into a single zip file for debugging purposes. Among these files are extracted text dumps and Scorecards that can potentially contain confidential content. Once someone downloads this zip file, the confidential content is outside of your control.

This is why a warning appears whenever someone finishes a workload capture that contains confidential content. The warning tells administrators that the captured data contains confidential content and suggests that they might want to start another capture.
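
If you want to double-check a downloaded capture before passing it on, a short script can list suspicious entries. The sketch below assumes that files inside the capture keep the "_confidential_" marker in their names, which this section doesn't confirm, so treat an empty result as a hint rather than a guarantee. The function name and the example file name are hypothetical.

```python
import zipfile

# Assumption: confidential files inside the capture keep this name marker.
MARKER = "_confidential_"


def confidential_entries(zip_path):
    """List archive members whose file name (not folder) contains MARKER."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if MARKER in name.rsplit("/", 1)[-1]
        ]


# Example usage with a hypothetical file name:
# for name in confidential_entries("workload_capture.zip"):
#     print("contains confidential file:", name)
```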