Creating a Searchable PDF File via OCR

Searchable PDF function

When scanned original data is converted to PDF format, transparent text data can be embedded in the PDF file to create a searchable PDF file. This function automatically generates the text information from scanned images using OCR (optical character recognition) technology.

This machine can recognize the following text sizes in the original.

When resolution is 200 dpi

  • Japanese: 12 pt to 142 pt

  • European and American languages: 9 pt to 142 pt

  • Asian languages: 20 pt to 142 pt

When resolution is 300 dpi

  • Japanese: 8 pt to 96 pt

  • European and American languages: 6 pt to 96 pt

  • Asian languages: 12 pt to 96 pt
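
The size limits above can be expressed as a small lookup table. The following is only an illustrative sketch, not part of the machine's software: the names RECOGNIZABLE_PT, is_recognizable, and pt_to_pixels are hypothetical, and the pixel conversion assumes the common definition of 1 pt = 1/72 inch, which this manual does not state.

```python
# Recognizable text sizes in points, copied from the lists above.
# Key: (resolution in dpi, language group) -> (minimum pt, maximum pt).
# The group names are hypothetical labels chosen for this sketch.
RECOGNIZABLE_PT = {
    (200, "japanese"): (12, 142),
    (200, "european_american"): (9, 142),
    (200, "asian"): (20, 142),
    (300, "japanese"): (8, 96),
    (300, "european_american"): (6, 96),
    (300, "asian"): (12, 96),
}


def is_recognizable(size_pt, dpi, group):
    """Return True if text of size_pt falls inside the listed range."""
    try:
        low, high = RECOGNIZABLE_PT[(dpi, group)]
    except KeyError:
        raise ValueError(f"no size range listed for {dpi} dpi / {group}")
    return low <= size_pt <= high


def pt_to_pixels(size_pt, dpi):
    """Approximate character height in pixels, assuming 1 pt = 1/72 inch."""
    return size_pt / 72 * dpi


if __name__ == "__main__":
    # 10 pt Japanese text is below the 200 dpi minimum but inside the 300 dpi range.
    print(is_recognizable(10, 200, "japanese"))
    print(is_recognizable(10, 300, "japanese"))
```

Under that assumption, the minimum sizes for Japanese (12 pt at 200 dpi, 8 pt at 300 dpi) both correspond to roughly 33 pixels of character height, which suggests why a higher resolution allows smaller text to be recognized.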

Tips
  • To use this function, an option is required. For details on the required option, refer to [Scan options].

  • Text data may not be recognized correctly in the following cases:
    The original contains text that the MFP does not support.
    The language selected differs from the language of the original.
    The page orientation is not adjusted automatically, and the original orientation does not match the text direction.

Creating a searchable PDF file

When sending a PDF file, you can create a searchable PDF file using OCR character recognition technology.

To create a searchable PDF, select [PDF] or [Compact PDF] as the file type, and select [PDF Detail Setting] - [Searchable PDF]. Then, configure the following settings.

  • [ON]/[OFF]: Select [ON] to create a searchable PDF file.

  • [Language Setting]: Select the language for OCR processing. Choose the language used in the original so that text data is recognized correctly.

  • [Adjust Rotation]: Set this option to ON to automatically rotate each page based on the text direction detected by OCR processing. If rotation adjustment is disabled and the specified original orientation does not match the text direction, text data is not recognized correctly.

  • [Document Name Auto Extraction]: Set this option to ON to automatically extract a character string suitable as a document name from the OCR character recognition result and use it as the document name. The document name is assigned automatically based on the character recognition result of the first page, the date, the time, and a serial number.

Tips
  • Selecting [Compact PDF] for [File Type] may provide faster OCR processing than [PDF].

  • [Adjust Rotation] is not available when encryption using a digital certificate (digital ID) is also enabled.

  • When [PDF/A] is set to [PDF/A-1a], the searchable PDF setting is not available.

  • If one of the following languages is selected in [Language Setting], the text direction is recognized automatically.
    [Japanese], [Simplified Chinese], [Korean], [Traditional Chinese]

  • If one of the following languages is selected in [Language Setting] and vertical and horizontal text are mixed on the same page of an original, all text on that page is recognized as a single direction.
    [Simplified Chinese], [Korean], [Traditional Chinese]
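
The restrictions described in the settings and tips above can be summarized as a simple consistency check. This is a minimal sketch for illustration only: the ScanSettings class, its field names, and find_conflicts are hypothetical and do not correspond to any interface of the machine, and the "English" and "TIFF" values are assumed examples rather than values taken from this manual.

```python
from dataclasses import dataclass
from typing import List, Optional

# Languages for which the text direction is recognized automatically (from the tips above).
AUTO_DIRECTION_LANGUAGES = {
    "Japanese", "Simplified Chinese", "Korean", "Traditional Chinese",
}


@dataclass
class ScanSettings:
    """Hypothetical container mirroring the settings named in this section."""
    file_type: str = "PDF"               # [PDF] or [Compact PDF] allow searchable PDF
    searchable_pdf: bool = False         # [Searchable PDF] ON/OFF
    language: str = "English"            # [Language Setting]; default is an assumption
    adjust_rotation: bool = False        # [Adjust Rotation]
    pdfa: Optional[str] = None           # for example "PDF/A-1a"
    digital_id_encryption: bool = False  # encryption using a digital certificate


def find_conflicts(settings: ScanSettings) -> List[str]:
    """Return messages for combinations this section describes as unavailable."""
    problems: List[str] = []
    if not settings.searchable_pdf:
        return problems
    if settings.file_type not in ("PDF", "Compact PDF"):
        problems.append("Searchable PDF requires [PDF] or [Compact PDF] as the file type.")
    if settings.pdfa == "PDF/A-1a":
        problems.append("Searchable PDF is not available when [PDF/A] is set to [PDF/A-1a].")
    if settings.adjust_rotation and settings.digital_id_encryption:
        problems.append("[Adjust Rotation] is not available with digital ID encryption.")
    return problems


def direction_is_automatic(settings: ScanSettings) -> bool:
    """True if the selected language has automatic text direction recognition."""
    return settings.language in AUTO_DIRECTION_LANGUAGES


if __name__ == "__main__":
    risky = ScanSettings(searchable_pdf=True, adjust_rotation=True,
                         digital_id_encryption=True)
    for message in find_conflicts(risky):
        print(message)
```

A check of this kind could be run before a scan job is submitted, so that unavailable combinations are reported together rather than discovered one at a time on the panel.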
