Glossary

A

ABBYY FineReader document is an object created by the ABBYY FineReader software to process one source document with structure analysis. It contains page images with corresponding recognized text (if any) and program settings (scanning, recognition, saving options, etc.).

ABBYY Hot Folder is a scheduling agent which allows you to select a folder with images and set the time for processing images in this folder. The images from the selected folder will be processed automatically at the specified time.

ABBYY Screenshot Reader is an application to create screenshots and recognize texts in them.

Abbreviation is a shortened form of a word or phrase used to represent the whole. For example, MS-DOS (for Microsoft Disk Operating System) or UN (for United Nations).

Activation is the process of obtaining a special code from ABBYY which allows the user to use his/her copy of the software in full mode on a given computer.

Activation code is a code that is issued by ABBYY to each user of ABBYY FineReader 10 during the activation procedure. The activation code is required to activate ABBYY FineReader on the computer that generated the Product ID.

Active area is a selected area on an image that can be deleted, moved or modified. To make an area active, click it. The frame enclosing an active area is bold and has small squares that can be dragged to change the size of the area.

Automatic Document Feeder (ADF) is a device that automatically feeds documents to a scanner. A scanner with an ADF can scan multiple pages without manual intervention. ABBYY FineReader also supports scanning multi-page documents.

ADRT® (Adaptive Document Recognition Technology) is a technology that increases conversion quality of multi-page documents. For example, it can recognize such structural elements as headings, headers and footers, footnotes, page numbering and signatures.

Area is a section on an image enclosed by a frame. Before performing OCR, ABBYY FineReader detects text, picture, table, and barcode areas in order to determine which sections of the image should be recognized and in what order.

Area template is a template that contains information about the size and location of areas within a set of similar-looking documents.

Automation Manager is a built-in manager which allows you to run an automated task, create and modify automated tasks, and delete custom automated tasks which you no longer use.

B

Barcode area is an area that encloses a section of an image containing a barcode.

Brightness is a scanning parameter that indicates the contrast between black and white image areas. Setting the correct brightness increases recognition quality.

C

Code page is a table that sets the interrelation between the character codes and the characters themselves. Users can select the characters they need from the set available in the code page.
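
As a small illustration of how a code page relates character codes to characters, the sketch below decodes the same byte value through two code pages using Python's built-in codecs. The choice of the two Windows code pages is only an example and is not taken from ABBYY FineReader itself.

```python
# One byte value, interpreted through two different code pages.
raw = bytes([0xE9])

western = raw.decode("cp1252")   # Windows Western European code page
cyrillic = raw.decode("cp1251")  # Windows Cyrillic code page

# The code is the same, but each code page maps it to a different character.
print(western, cyrillic)
```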

Color mode is a scanning parameter that determines whether an image must be scanned in black and white, grayscale, or color.

Compound word is a word made up of two or more stems (general meaning); a word not found in the dictionary, but potentially made up of two or more terms found in the dictionary (ABBYY FineReader meaning).
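
The ABBYY FineReader meaning can be illustrated with a minimal sketch: a word counts as a compound if it can be split into two or more terms that are all found in a dictionary. The function name `split_compound` and the tiny sample dictionary are hypothetical, and this is not a description of how ABBYY FineReader actually performs the check.

```python
def split_compound(word, dictionary):
    """Return dictionary terms that concatenate to `word`, or None."""
    word = word.lower()
    # best[i] holds one way to split word[:i] into dictionary terms, or None.
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            if best[start] is not None and word[start:end] in dictionary:
                best[end] = best[start] + [word[start:end]]
                break
    parts = best[len(word)]
    # A compound needs at least two parts; one part is an ordinary dictionary word.
    return parts if parts and len(parts) >= 2 else None


# Hypothetical sample dictionary, for illustration only.
dictionary = {"book", "shelf", "news", "paper"}
print(split_compound("bookshelf", dictionary))
```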

D

Document analysis is a process of selecting logical structure elements and different types of areas in a document. Document analysis can be carried out automatically or manually.

Document open password is a password which prevents users from opening a PDF document unless they type the password the author specified.

Document options is the set of options that can be selected in the Options dialog box (Tools>Options). Options sets also include user languages and patterns. Options sets can be saved and then used (loaded) in other ABBYY FineReader documents.

Dots per inch (dpi) is a standard of measurement for the resolution of images.

Driver is a software program that controls a computer peripheral (e.g., a scanner or a monitor).

F

Font effects is the appearance of a font (i.e. bold, italic, underlined, strikethrough, subscript, superscript, small caps).

I

Ignored characters are any non-letter characters found in words (e.g. syllable characters or stress marks). These characters are ignored during the spell check.

Inverted image is an image with white characters against a dark background.
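
For an 8-bit grayscale image, inverting means replacing each pixel value v with 255 - v, which turns white characters on a dark background into dark characters on a white background. The sketch below assumes the image is a plain list of rows of values from 0 to 255; the helper name `invert` is hypothetical.

```python
def invert(pixels):
    """Invert an 8-bit grayscale image given as rows of values in 0..255."""
    return [[255 - value for value in row] for row in pixels]


# White (255) strokes on a black (0) background.
inverted_image = [
    [0, 255, 0],
    [0, 255, 0],
]
normal_image = invert(inverted_image)
print(normal_image)
```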

L

License Manager is a utility used for managing ABBYY FineReader licenses and activating ABBYY FineReader 10 Corporate Edition.

Ligature is a combination of two or more "glued" characters (such as fi, fl, ffi). These characters are difficult to separate because they are usually "glued" in print. Treating them as a single compound character improves OCR accuracy.
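
On the text side, Unicode encodes some ligatures as single characters. The sketch below uses Python's standard `unicodedata` module to expand such characters into their separate letters, which can be useful when searching recognized text. It only illustrates the single-character representation and says nothing about how ABBYY FineReader stores ligatures internally.

```python
import unicodedata

# "final office" written with the single-character fi (U+FB01) and ffi (U+FB03) ligatures.
ligature_text = "\ufb01nal o\ufb03ce"

# NFKC normalization replaces compatibility characters such as ligatures
# with their ordinary letter sequences.
plain_text = unicodedata.normalize("NFKC", ligature_text)
print(plain_text)
```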

M

Monospaced font is a font (such as Courier New) in which all characters are equally spaced. For better OCR results on monospaced fonts, select Tools>Options..., click the Document tab, and select Typewriter under Document print type.

O

Omnifont system is a recognition system that recognizes characters set in any font and font size without prior training.

Optional hyphen is a hyphen (¬) that indicates exactly where a word or word combination should be split if it occurs at the end of a line (e.g. "autoformat" should be split into "auto-format"). ABBYY FineReader replaces all hyphens found in dictionary words with optional hyphens.

P

Page layout is the arrangement of text, tables, pictures, paragraphs, and columns on a page, as well as fonts, font sizes, font colors, text background, and text orientation.

Page layout analysis is the process of detecting areas on a page image. Areas can be of five types: text, picture, table, barcode, and recognition area. Page layout analysis can be performed automatically when the user clicks the Read button, or manually by the user prior to OCR.

Paradigm is the set of all grammatical forms of a word.

Pattern is a set of pairs (each pair contains a character image and the character itself) that is created during pattern training.

PDF security settings are restrictions that can prevent a PDF document from being opened, edited, copied or printed. These settings include Document Open Passwords, Permissions Passwords, and encryption levels.

Permissions Password is a password which prevents other users from printing and editing a PDF document unless they type the password the author specified. If some security settings are selected for the document, other users will not be able to change these settings until they type the password the author specified.

Picture area is an area used for sections of an image that contain pictures. This type of area may enclose an actual picture or any other object that should be displayed as a picture (e.g. a section of text).

Primary form is the form of a headword in a dictionary entry.

Print type is a parameter reflecting how the source text was printed (on a laser printer or equivalent, on a typewriter, etc.). For laser-printed texts, select Autodetect; for typewritten texts, select Typewriter; for faxes, select Fax.

Product ID is the parameter that is automatically generated based on the hardware configuration when activating ABBYY FineReader on a particular computer.

Prohibited characters are characters that will never be found in the recognized text. They may be specified in a set of prohibited characters in the language group properties. Specifying these characters increases the speed and quality of OCR.

R

Resolution is a scanning parameter that determines how many dpi to use during scanning. A resolution of 300 dpi should be used for texts set in 10pt font size and larger; 400 to 600 dpi is preferable for texts set in smaller font sizes (9pt and smaller).
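
As a worked example of what a dpi value means in practice, the sketch below computes the pixel size of a scanned page and returns the resolution range given in this entry for a font size. Both function names are hypothetical, and treating every size below 10pt as "smaller" is an assumption, since the entry does not say what to use between 9pt and 10pt.

```python
def scan_size_in_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a page of the given size (in inches) scanned at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


def suggested_dpi_range(font_size_pt):
    """Resolution range from the glossary entry, as (minimum, maximum) dpi."""
    if font_size_pt >= 10:
        return (300, 300)
    # Assumption: any size below 10pt is treated as small text.
    return (400, 600)


# An 8.5 x 11 inch page scanned at 300 dpi.
print(scan_size_in_pixels(8.5, 11, 300))
print(suggested_dpi_range(8))
```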

Recognition area is an area enclosing a section of an image that ABBYY FineReader should analyze automatically.

S

Scanner is a device for inputting images into a computer.

Separators are symbols that can separate words (e.g. /, \, dash) and that are separated by spaces from the words themselves.

T

Table area is an area used for sections of an image that contain tables or text structured as a table. When the application reads this type of area, it draws vertical and horizontal separators inside the area to form a table. This area is then rendered as a table in the output text.

Tagged PDF is a PDF document which contains information about the document structure such as its logical parts, pictures, tables, etc. This structure is encoded in PDF tags. A PDF file equipped with the tags may be reflowed to fit different screen sizes and will display well on handheld devices.

Text area is an area that contains text. Note that text areas should only contain single-column text.

Training is establishing a correspondence between a character image and the character itself. (For details, see the Recognition with Training section.)

U

Uncertain characters are characters that may have been recognized incorrectly. ABBYY FineReader highlights uncertain characters.

Uncertain words are words containing one or several uncertain characters.
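
The relationship between the two terms can be sketched as follows: a word is uncertain as soon as any one of its characters is uncertain. The per-character confidence scores, the threshold value, and the function name `uncertain_words` are all hypothetical; the glossary does not say how ABBYY FineReader decides that a character is uncertain.

```python
def uncertain_words(words, threshold=0.8):
    """Return the words that contain at least one uncertain character.

    Each word is a list of (character, confidence) pairs. The confidence
    scores and the threshold are illustrative assumptions.
    """
    result = []
    for chars in words:
        if any(conf < threshold for _, conf in chars):
            result.append("".join(ch for ch, _ in chars))
    return result


recognized = [
    [("c", 0.99), ("a", 0.98), ("t", 0.97)],
    [("d", 0.95), ("0", 0.42), ("g", 0.96)],  # the middle character is doubtful
]
flagged = uncertain_words(recognized)
print(flagged)
```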

Unicode is a standard developed by the Unicode Consortium (Unicode, Inc.). The standard is an international encoding system, originally designed as a 16-bit encoding, for processing texts written in the main world languages. The standard is easily extended. The Unicode Standard determines the character encoding, as well as properties and procedures used in processing texts written in a certain language.
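
Each Unicode character has a numeric code point. In the UTF-16 encoding form, most characters take one 16-bit code unit, while characters outside the Basic Multilingual Plane take two. The short sketch below shows this with Python's built-in `ord` and `str.encode`; the sample characters are arbitrary.

```python
# A Latin letter, a Cyrillic letter, and a character outside the Basic Multilingual Plane.
samples = ["A", "\u0416", "\U0001F600"]

# Number of 16-bit code units each character needs in UTF-16.
units = {ch: len(ch.encode("utf-16-le")) // 2 for ch in samples}

for ch in samples:
    print(f"U+{ord(ch):04X} uses {units[ch]} 16-bit code unit(s)")
```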
