Recognition with Training

As previously stated, FineReader can read texts set in practically any font regardless of print quality. Consequently, no prior training is normally required before recognition can take place. Nevertheless, FineReader provides a number of user pattern training tools.

Train User Pattern mode may come in useful when:

  1. recognizing texts set in decorative fonts;
  2. recognizing texts containing unusual characters (e.g. mathematical symbols);
  3. recognizing large volumes (more than a hundred pages) of texts of low print quality. 

Tip: Use Train User Pattern mode only if one of the above applies. In other cases you may obtain a slight increase in recognition quality, but the time and effort involved will probably outweigh the benefit received.

Pattern training works as follows. One or two pages are recognized in training mode, and a pattern is created from them. FineReader then uses this pattern to aid recognition of the remaining text.
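
The train-then-reuse idea can be illustrated with a toy sketch. This is not FineReader's actual algorithm, which this text does not describe. It assumes that characters are tiny bitmaps, that a pattern is simply a table of user-labeled bitmaps, and that matching is done by counting differing pixels. All names (Glyph, train_pattern, recognize) are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: a glyph is a tiny bitmap, one string of '#' and '.' per row.
Glyph = Tuple[str, ...]


def hamming(a: Glyph, b: Glyph) -> int:
    """Count differing pixels between two glyphs of the same size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def train_pattern(samples: List[Tuple[Glyph, str]]) -> Dict[Glyph, str]:
    """Build a pattern from glyphs the user labeled on the training pages."""
    return {glyph: label for glyph, label in samples}


def recognize(glyph: Glyph, pattern: Dict[Glyph, str]) -> str:
    """Return the label of the trained glyph closest to the input glyph."""
    best = min(pattern, key=lambda trained: hamming(glyph, trained))
    return pattern[best]


# Training pages: the user confirms what each glyph means.
BAR = ("#.", "#.", "#.")
DOT = ("..", "..", "#.")
pattern = train_pattern([(BAR, "l"), (DOT, ".")])

# Remaining text: a slightly noisy bar is still closest to the trained "l".
noisy_bar = ("#.", "##", "#.")
result = recognize(noisy_bar, pattern)
```

In this sketch a label may be more than one character long, so a trained glyph could also stand for an inseparable combination such as "fi".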

Sometimes two or even three characters may get "glued" together, and FineReader may be unable to enclose each character in an individual frame to separate them. If this proves to be the case (i.e. you cannot move the frame so that it contains only one whole character and no parts of other characters), you can train FineReader to recognize the inseparable character combination as a whole. Examples of character combinations frequently found glued together include ff, fi, and fl. Such combinations are referred to as ligatures.

Notes:

  1. A pattern is only useful in the case of documents that have the same font, font size, and resolution as the document used to create the user pattern.
  2. Each pattern is created for a particular batch. Consequently, if a batch is deleted, its user pattern is also deleted. Patterns can, however, be copied into other batches. To transfer a user pattern to another batch, simply save the batch options in a batch template format file.
  3. If you switch to recognizing texts set in a different font, always disable any user patterns: choose the Do not use user pattern item on the Recognition tab (Tools>Options menu).
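
Notes 1 and 3 amount to a simple rule of thumb, sketched below. The PrintProperties type and the recognition_option function are hypothetical helpers for illustration, not part of FineReader. The sketch assumes that "the same" means exact equality of font, size, and resolution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrintProperties:
    """Hypothetical description of how a document is printed and scanned."""
    font: str
    size_pt: float
    resolution_dpi: int


def recognition_option(trained_on: PrintProperties, document: PrintProperties) -> str:
    """Pick the Recognition tab option suggested by notes 1 and 3."""
    # Assumption: the pattern helps only if all three properties match exactly.
    if trained_on == document:
        return "Use user pattern"
    return "Do not use user pattern"


trained = PrintProperties(font="Fraktur", size_pt=12, resolution_dpi=300)
same_print = PrintProperties(font="Fraktur", size_pt=12, resolution_dpi=300)
other_font = PrintProperties(font="Courier", size_pt=12, resolution_dpi=300)

choice_same = recognition_option(trained, same_print)
choice_other = recognition_option(trained, other_font)
```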

To train a user pattern:

  1. Start Train user pattern mode: click the Train user pattern radio button in the Training group on the Recognition tab (Tools>Options menu). The default pattern name ("Default") will be displayed in the status bar.
  2. Click the 2-Read button.
  3. Train your pattern - recognize one or more pages in Train user pattern mode.
    Trained characters are saved in the default pattern. Once you have completed training the pattern, FineReader will save the pattern (Default.ptn) in the current batch folder.
  4. Edit your pattern.
  5. Deactivate training mode (click the Use user pattern button on the Recognition tab).
  6. Recognize the rest of the text - click the 2-Read button. 

Notes:

  1. To create several patterns for the same batch, use the Pattern Editor dialog (click the Pattern Editor button on the Recognition tab or select the Tools>Pattern Editor menu item). Create a new pattern (click the New button in the dialog) and select it (click the Set Active button). Working with a created pattern is no different from working with the default pattern (see steps 1-5). Keep in mind, however, that only one pattern may be active at any one time.
  2. If you've created several patterns for the same batch, the active one will be the pattern that was last created. The active pattern name is displayed in the status bar. To activate another pattern, select the pattern of your choice in the pattern list in the Pattern Editor dialog (Tools>Pattern Editor menu) and click the Set Active button. Then click the Use user pattern button in the Training group on the Recognition tab (Tools>Options menu).
  3. If the Use built-in patterns option is set, FineReader will read all texts using its built-in patterns and stop only at uncertain characters. If you are training the system to read decorative and/or non-standard fonts (for example, Tibetan), the use of built-in patterns may result in characters being read incorrectly. If this occurs, disable the use of built-in patterns (clear the Use built-in patterns checkbox on the Recognition tab) and train the system to recognize each unknown character it is likely to encounter.
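
The "one active pattern per batch" rule from notes 1 and 2 can be modeled as follows. This is only a sketch of the behavior described above; the BatchPatterns class and its methods are hypothetical and do not correspond to any FineReader programming interface. It assumes that a newly created pattern becomes active, as note 2 states.

```python
class BatchPatterns:
    """Hypothetical model of the patterns that belong to one batch."""

    def __init__(self) -> None:
        # Every batch starts with the default pattern, which is active.
        self.names = ["Default"]
        self.active = "Default"

    def new(self, name: str) -> None:
        """Create a pattern; the last created pattern becomes the active one."""
        if name in self.names:
            raise ValueError(f"pattern already exists: {name}")
        self.names.append(name)
        self.active = name

    def set_active(self, name: str) -> None:
        """Activate an existing pattern; only one can be active at a time."""
        if name not in self.names:
            raise KeyError(name)
        self.active = name


patterns = BatchPatterns()
patterns.new("Gothic")
patterns.new("Math")
active_after_create = patterns.active  # the pattern created last

patterns.set_active("Gothic")  # switch back, as with the Set Active button
```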