Most content contains more than text: it also includes images, diagrams, and sometimes markup like XML or HTML. However, we're only interested in checking the text, so we need a way to get it out of your documents. The technical term for this process is "text extraction". For more information, see how to create or configure a Content Profile.
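To make the idea of text extraction concrete, here is a minimal sketch in Python that pulls the readable text out of an HTML fragment and ignores tags, images, and script or style content. It only illustrates the general concept and does not show how Acrolinx implements extraction. The names `TextExtractor` and `extract_text` are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Illustrative only: collects text nodes and ignores markup."""

    # Elements whose content is code or styling, not readable text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Tags such as <img> carry no text, so they contribute nothing.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(markup: str) -> str:
    """Return the text content of an HTML fragment, joined by spaces."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


sample = '<h1>Title</h1><p>Check <em>this</em> text.</p><img src="a.png">'
print(extract_text(sample))
```

In this sketch, only the heading and paragraph text are kept. The image and the tags themselves are dropped, which is the kind of separation that text extraction performs before the text is checked.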
The following diagram visualizes how Acrolinx reads content.