Automatic Page Layout Analysis Options

During automatic page layout analysis, the following types of blocks are drawn: text blocks, table blocks, picture blocks, and barcode blocks.

To start automatic layout analysis (and text recognition) click the 2-Read button. Before clicking this button, however, select the main layout analysis options: document type and table analysis options.

Document type

In most cases text layout is determined automatically. Automatic detection is performed when the Autodetect layout value is selected in the Document Type group on the Recognition tab (Tools>Options). This value is set by default.

To select the document type manually, open the Recognition tab (Tools>Options) and select the required value in the Document Type group.

Document types available:

Autodetect layout - (set by default) Text layout is determined automatically. Recognition of all text types, including multi-column texts, and texts containing tables and pictures, is performed automatically.  
Single column - The text is formatted into one column. Use this option if automatic page layout analysis incorrectly determines the text type as multi-column.
Plain text formatted with spaces - The text is formatted into one column and set in a monospaced font that is uniform in size throughout. In the recognized text left indents are represented by spaces, each line is made into a separate paragraph, and original paragraphs are separated by means of empty lines. Useful, for example, when recognizing C++ code printouts or old computer printouts.
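The output conventions of the Plain text formatted with spaces type can be illustrated with a short sketch. This is not the application's implementation: the Line class, the render_plain_text function, and the idea of measuring an indent in character columns are all hypothetical names and assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """One recognized line (hypothetical model, not part of the application)."""
    indent: int  # left indent, assumed to be measured in monospaced character columns
    text: str


def render_plain_text(paragraphs):
    """Render paragraphs (lists of Line) as space-formatted plain text."""
    out = []
    for i, paragraph in enumerate(paragraphs):
        if i > 0:
            out.append("")  # an empty line separates the original paragraphs
        for line in paragraph:
            # the left indent becomes a run of spaces; every line stays on its own row
            out.append(" " * line.indent + line.text)
    return "\n".join(out)


sample = [
    [Line(0, "int main() {"), Line(4, "return 0;"), Line(0, "}")],
    [Line(0, "// end of file")],
]
print(render_plain_text(sample))
```

In this sketch the indentation of a code printout survives as leading spaces, which is why the document type suits C++ listings and old computer printouts.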

Table analysis options

In most cases the application divides tables into rows and columns automatically. If additional tuning of table options is required, open the Recognition tab (Tools>Options) and select the necessary item in the Tables group. The options are used as follows:

  1. Use the One line of text per cell option if your table has no (or only a few) black separators and there is only one line of text per cell. For example:

    Kilometers    Miles
    1             0.62
    5             3.2

    - this table has only one line of text per cell

    Physical phenomenon     t, degrees centigrade
    Water boiling point     100
    Water freezing point    0

    - this table has more than one line of text per cell (the cell text wraps onto a second line in the printed table)

  2. Use the No merged cells in table option if your table has no merged cells in it. The following table, for example, does contain a merged cell:

                 Temperature
    Degrees centigrade    Degrees Kelvin
    -273                  0
    100                   373

    - the Temperature cell is a merged cell
Note: Do not select the One line of text per cell or No merged cells in table options if the tables in your text have differing structures. Selecting these options in that case may cause errors during layout analysis and reduce recognition quality.
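The rule in the note can be sketched as a small check: an option is only safe when every table in the text satisfies its condition. The Cell class and the suggest_table_options function below are hypothetical and are not part of the application. The sketch also assumes that the table structure is already known, and it does not model the condition about black separators.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One table cell (hypothetical model, not part of the application)."""
    lines: int = 1     # number of text lines in the cell
    col_span: int = 1  # columns covered by the cell; more than 1 means merged
    row_span: int = 1  # rows covered by the cell; more than 1 means merged


def suggest_table_options(tables):
    """Return which table options hold for ALL tables (lists of rows of Cell)."""
    cells = [cell for table in tables for row in table for cell in row]
    has_cells = bool(cells)  # with no tables at all, suggest neither option
    return {
        "One line of text per cell": has_cells
        and all(c.lines == 1 for c in cells),
        "No merged cells in table": has_cells
        and all(c.col_span == 1 and c.row_span == 1 for c in cells),
    }


# The Kilometers/Miles table: three rows, two plain cells each.
km_miles = [[Cell(), Cell()] for _ in range(3)]

# The Temperature table: the first row is a single cell merged across two columns.
temperature = [[Cell(col_span=2)]] + [[Cell(), Cell()] for _ in range(3)]

print(suggest_table_options([km_miles]))
print(suggest_table_options([km_miles, temperature]))
```

With both tables in the same text, the sketch no longer suggests No merged cells in table, which mirrors the warning about tables with differing structures.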
