Samples

Code Printouts (plain text formatted with spaces)

Situation description: this example has two peculiarities that strongly affect recognition quality:

  • left indents are stored as paragraph indent values rather than as spaces, so they are lost when the text is saved in TXT format; in addition, some lines are merged into one paragraph, which is saved in TXT format as a single text line;
  • programming language constructs are recognized with too many errors.

Code printouts (listing.tif)

Solution:

  1. FineReader has a special option for recognizing such documents correctly: Plain text formatted with spaces. It tells the program that the text is laid out in a single column and set in a monospaced font of one size. In the recognized text, left indents are represented as spaces, every line becomes a separate paragraph, and the original paragraphs are separated by empty lines. This preserves the original text formatting when the text is saved in TXT format. To set this option:

    • In the Options dialog (Tools>Options menu), open the Recognition tab and select the Plain text formatted with spaces item in the Document type group.
  2. To recognize code printouts well, you also need to set a special recognition language. To do this:
    • In the language list on the Standard toolbar, select the Choose more languages item; then, in the Recognition language dialog that opens, select the C++ item.

    Note: If code printouts contain some additional text comments, select two recognition languages to read the document correctly: the programming language and the language of text comments.
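
The difference between the two ways of saving described above can be illustrated with a small sketch. It is not FineReader code and does not use any FineReader API: the Line class and both save functions are hypothetical names introduced only for this illustration, and the indent is assumed to be measured in monospaced character cells. The first function mimics the problem (indents kept only as paragraph properties, lines merged into one text line); the second mimics the result of the Plain text formatted with spaces option.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """One recognized text line (hypothetical model, not a FineReader type)."""
    text: str         # line content without leading whitespace
    indent_cols: int  # left indent, assumed to be in monospaced character cells


def save_with_paragraph_indents(paragraphs):
    # Indents exist only as paragraph properties, which TXT cannot store,
    # and the lines of each paragraph are merged into a single text line.
    return "\n".join(" ".join(line.text for line in para) for para in paragraphs)


def save_formatted_with_spaces(paragraphs):
    # Every line stays on its own line, its indent becomes leading spaces,
    # and the original paragraphs are separated by an empty line.
    blocks = []
    for para in paragraphs:
        blocks.append("\n".join(" " * line.indent_cols + line.text for line in para))
    return "\n\n".join(blocks)


# A tiny made-up printout: one code paragraph and one comment paragraph.
paragraphs = [
    [Line("int main()", 0), Line("{", 0), Line("return 0;", 4), Line("}", 0)],
    [Line("// end of file", 0)],
]

print(save_with_paragraph_indents(paragraphs))
print("-----")
print(save_formatted_with_spaces(paragraphs))
```

In the first output the function body collapses into one line and the indent before "return 0;" disappears; in the second, the four-space indent and the line breaks survive in plain text.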