Define How You Want Acrolinx To Track Your Content

Content Reference or Content Fingerprint?

Content Tracking helps you follow your content over its entire lifespan. This gives you more accurate and useful analytics.

Acrolinx can track your content in two different ways. The approach you choose will depend on the workflows and content types used in your organization.

The Content Reference lets you define exactly how to identify your content. For example, you could use the document file path or some content-specific metadata. This option gives you fine-grained control, since you can define a different Content Reference for each content type using Content Profiles. It's the ideal setup if you know that your document file paths won't change, or that your content contains some stable and unique metadata.

The Content Fingerprint lets Acrolinx identify your content for you. Acrolinx looks at the written content itself, and doesn't rely on file paths or metadata. This is a global option that applies across all your content types, which makes it well suited to workflows where content might be copied between different file types, or where files might move and metadata might change over time.

By default, Acrolinx uses the Content Reference method to track your content, and uses your content's file path as its reference. The rest of this article walks you through how and why you can modify this default behavior.

Content Reference

By default, Acrolinx uses the file path as the unique reference for your content. Basically, what you see as the "Source" in a Scorecard is the default reference. 

If you're happy with this, then you don't need to change anything. But maybe you have a better way of tracking your content. 

Why would you need a better way? Well, consider the following scenario.

The Problem

You're working on a Markdown file called "introduction-to-demo-inc-greeblies.md". For argument's sake, let's say that you moved it to different locations and even changed your mind about the file name just before you published it.

The evolution of your checking statistics would look something like this:

Check Time | File Path | Content Reference | Score | Improvement
First version | /doc/drafts/introduction-to-demo-inc-greeblies.md | "/doc/drafts/introduction-to-demo-inc-greeblies.md" | 65 | No Previous Data
Revised version | /doc/demo-inc/topics/introduction-to-demo-inc-greeblies.md | "/doc/demo-inc/topics/introduction-to-demo-inc-greeblies.md" | 78 | No Previous Data
Final version | /doc/demo-inc/topics/overview-of-demo-inc-greeblies.md | "/doc/demo-inc/topics/overview-of-demo-inc-greeblies.md" | 84 | No Previous Data

The last column shows "No Previous Data" because Acrolinx treats each version as a separate file rather than different iterations of the same file.

The Solution

To get around this problem, you can tell Acrolinx to use a different reference. As long as you have an attribute that is stable across content versions, you'll be able to track that content.

For example, the front matter in your Markdown file might contain a "slug" parameter. You've set the slug to "greeblies-intro" and it doesn't change between the different versions. You can configure Acrolinx to use the value for the "slug" parameter as the content reference. 

The evolution of your checking statistics would now look something like this:

Check Time | File Path | Content Reference | Score | Improvement
First version | /doc/drafts/introduction-to-demo-inc-greeblies.md | "greeblies-intro" | 65 | No Previous Data
Revised version | /doc/acme-widgets/topics/introduction-to-demo-inc-greeblies.md | "greeblies-intro" | 78 | +13
Final version | /doc/acme-widgets/topics/overview-of-demo-inc-greeblies.md | "greeblies-intro" | 84 | +6

The "Improvement" column shows progress because all quality scores are attributed to a single content reference "greeblies-intro".
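
To make the difference between the two tables concrete, here is a minimal Python sketch of how an "Improvement" value could be derived by grouping checks on their content reference. It is an illustration only, not Acrolinx's implementation: the `Check` class and the `improvements` function are hypothetical names, and the paths and scores are copied from the tables above.

```python
from dataclasses import dataclass


@dataclass
class Check:
    """One check of a document (hypothetical structure for illustration)."""
    file_path: str
    content_reference: str
    score: int


def improvements(checks):
    """Return one 'Improvement' value per check, in check order.

    A check is compared only with the previous check that has the
    same content reference.
    """
    last_score = {}  # content reference -> score of the most recent check
    result = []
    for check in checks:
        previous = last_score.get(check.content_reference)
        if previous is None:
            result.append("No Previous Data")
        else:
            result.append(f"{check.score - previous:+d}")
        last_score[check.content_reference] = check.score
    return result


paths = [
    "/doc/drafts/introduction-to-demo-inc-greeblies.md",
    "/doc/acme-widgets/topics/introduction-to-demo-inc-greeblies.md",
    "/doc/acme-widgets/topics/overview-of-demo-inc-greeblies.md",
]
scores = [65, 78, 84]

# Default behavior: the file path is the content reference.
by_path = improvements([Check(p, p, s) for p, s in zip(paths, scores)])

# Stable reference: every version shares the slug "greeblies-intro".
by_slug = improvements([Check(p, "greeblies-intro", s) for p, s in zip(paths, scores)])

print(by_path)
print(by_slug)
```

With the file path as the reference, every check starts a new history. With the shared slug, the second and third checks are compared against the one before them.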


To configure the Content Reference, follow these steps:

  1. In the Dashboard, navigate to Guidance Settings > Content Profiles, and click on the relevant Content Profile. For our Markdown example above, we'd pick the Markdown Content Profile.
  2. Navigate to DETAILS > Content Reference.
  3. Select "Use part of the content as content reference" and enter the XPath for your stable attribute.
    For example, to use the "slug" parameter in Markdown front matter, your XPath could look like this:

You can test whether your content reference is working by checking a relevant file, then going to the Scorecard Archive dashboard. In the Document column, instead of the file path you should now see the content reference that you defined.

Content Fingerprint

The Content Reference is a great solution for keeping track of your content if you have some reliable attribute that you know will be unique and stable across all iterations of each piece of content. In reality, you might have a workflow where an article starts its life in a word processor like Microsoft Word, is then copied into an XML editor like XMetaL Author, and finally makes its way into a CMS like Adobe Experience Manager. You'd want to see how each step in that process influenced your content quality, but those different versions of your content don't share a file path or any other sort of unique metadata attribute. The only thing linking those versions of your article is the content itself.

Enter the Content Fingerprint. With this method, Acrolinx groups the content that you check based on how similar the written content is. If your content moves location or even format, Acrolinx will recognize that the substance of each article is very similar, and will treat them as different versions of the same article. If there isn't enough content for Acrolinx to make a decision, it will revert to the default Content Reference method.
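
The grouping idea can be sketched in a few lines of Python. This article doesn't describe how Acrolinx actually computes a fingerprint, so the sketch below swaps in a simple stand-in: character-level similarity from Python's standard `difflib` module. The threshold, the minimum word count, the function names, and the sample texts and file names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.8  # assumed cutoff, not an Acrolinx setting
MIN_WORDS = 5               # assumed minimum amount of text, not an Acrolinx setting


def normalize(text):
    """Lowercase the text and collapse whitespace so layout changes don't matter."""
    return " ".join(text.lower().split())


def assign_group(text, fallback_reference, groups):
    """Return the tracking key for a checked text.

    `groups` maps a group ID to the latest normalized text seen for that group.
    Texts that are too short fall back to the content reference.
    """
    normalized = normalize(text)
    if len(normalized.split()) < MIN_WORDS:
        return fallback_reference
    for group_id, representative in groups.items():
        ratio = SequenceMatcher(None, normalized, representative, autojunk=False).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            groups[group_id] = normalized  # remember the newest version
            return group_id
    group_id = f"fingerprint-{len(groups) + 1}"
    groups[group_id] = normalized
    return group_id


groups = {}
draft = "Greeblies are small widgets made by Demo Inc. They attach to any standard sprocket and need no tools to install."
revised = "Greeblies are small widgets made by Demo Inc.  They attach to any standard sprocket and require no tools to install."
unrelated = "Our refund policy lets you return any unopened product within thirty days of purchase."

g_draft = assign_group(draft, "greeblies.docx", groups)
g_revised = assign_group(revised, "greeblies.xml", groups)
g_unrelated = assign_group(unrelated, "refunds.md", groups)
g_short = assign_group("Too short", "note.md", groups)

print(g_draft, g_revised, g_unrelated, g_short)
```

In this sketch the draft and the revised text land in the same group even though their file names and formats differ, the unrelated text starts a new group, and the two-word text falls back to its content reference.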

Switching between these tracking methods fundamentally changes the way Acrolinx classifies your content for Analytics. This can make it difficult to compare Analytics from before and after the switch. To get the most valuable Analytics, we recommend that you avoid switching between these methods often.

Because the Content Fingerprint method automatically classifies your content for you, it can produce false positives if you have multiple very similar articles: Acrolinx might treat distinct articles as versions of the same article. Before deciding to use this method, it's worth thinking about the nature of your content and deciding whether it's the best option for you.

To enable a Content Fingerprint for all your content, follow these steps:

  1. In the Dashboard, navigate to Analytics > Administration > Content Tracking.
  2. Select Content Fingerprint

Once you select Content Fingerprint as the method to track your content, all future checks are tracked using a Content Fingerprint. The change doesn't apply to existing check data.