Code Printouts (plain text formatted with spaces)
Situation description: this example has two
peculiar features that strongly affect
recognition quality:
- Left indents are stored as paragraph indent
values rather than as spaces, so they are lost when
the text is saved in TXT format. In addition, some
lines are merged into one paragraph, which is
saved in TXT format as a single text line.
- Programming language constructs are recognized
with too many errors.
(listing.tif)
Solution:
- ABBYY FineReader has a special option for
recognizing such documents correctly:
Plain text formatted with spaces. It indicates
that the text is laid out in one column and set
in a monospaced font of a single size. In the
recognized text, left indents are represented as
spaces, every line becomes a separate paragraph,
and the original paragraphs are separated by
empty lines. This helps retain the original
formatting when the text is saved in TXT
format. To set this option:
- Select the Read as plain text formatted with
spaces item in the Read group of the Legacy
Options dialog. To open this dialog, click the
Legacy Options... button on the General tab of
the Options dialog (Tools > Options).
- To recognize code printouts well, you also
need to set a special recognition language.
To do this:
- Select the Choose more languages item in the
language list on the Standard toolbar, then
select the C++ item in the Recognition language
dialog that opens.
Note: If the code printouts also contain text
comments, select two recognition languages so
that the document is read correctly: the
programming language and the language of the
comments.
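The output convention described above for Plain text formatted with spaces can be sketched in a few lines of Python. This is only an illustration of the resulting TXT layout, not ABBYY FineReader's implementation or API. The Line class, its indent_chars field, and the to_plain_text function are hypothetical names introduced for this example, and the sketch assumes that the left indent of each line is already known in character widths.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    """One recognized line of a monospaced listing (hypothetical structure)."""
    text: str
    indent_chars: int  # assumed: left indent measured in character widths


def to_plain_text(paragraphs: List[List[Line]]) -> str:
    """Render paragraphs so that the layout survives in a TXT file.

    Each line is written on its own row with its indent expressed as
    leading spaces, and original paragraphs are separated by one empty line.
    """
    rows: List[str] = []
    for index, paragraph in enumerate(paragraphs):
        if index > 0:
            rows.append("")  # empty line between original paragraphs
        for line in paragraph:
            rows.append(" " * line.indent_chars + line.text)
    return "\n".join(rows)
```

For example, a paragraph holding the lines "int main()", "{", "return 0;" (with an indent of 4) and "}" is written as four separate rows, and the third row starts with four spaces, so the block structure of the code remains visible in the saved TXT file.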