PDF-Overlay

From MagnetoWiki

PDF-Overlay is a utility program for merging FineReader OCR output with original high resolution PDFs.

STATUS

Current Status: pre-Alpha, FOR TESTING ONLY

Known Issues

  • Page size defaults to US Letter
  • PDF metadata is lost
  • Underlying text position is slightly offset

Motivation

FineReader can produce output with text under the scanned image. However, the resulting PDF contains only a fax-quality image of the original scanned page, which is unacceptable for archiving documents. PDF-Overlay solves this problem by overlaying the original high-resolution scan on top of FineReader's output.

Download and Source Code

PDF-Overlay Auto Build

PDF-Overlay Source

PDF-Overlay Usage

To operate, drag and drop the original high-resolution PDF onto PDF-Overlay.app. The FineReader output must be in the same folder as the original, and named with FineReader's default naming scheme (i.e. "xxx processed by FineReader.pdf").

PDF-Overlay skips FineReader output PDFs and its own merged PDFs, so it is safe to select all PDFs in a folder and drop them on PDF-Overlay. Only PDFs with matching FineReader output are processed. PDF-Overlay always overwrites an existing merged PDF.

Example File Naming

Input Files:

  • 2008-05-27.pdf
  • 2008-05-27 processed by FineReader.pdf

Output File:

  • 2008-05-27 merged by PDF-Overlay.pdf
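
The pairing rules above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not PDF-Overlay's actual source: the function name plan_merges and the two suffix constants are hypothetical, and the sketch only decides which files would be merged without touching any PDF content.

```python
from pathlib import Path

# Suffixes taken from the naming scheme described on this page.
# The constant names are hypothetical, chosen for this sketch.
FINEREADER_SUFFIX = " processed by FineReader"
MERGED_SUFFIX = " merged by PDF-Overlay"


def plan_merges(dropped):
    """Return (original, finereader_output, merged_output) path triples.

    `dropped` is any iterable of paths, for example every file that was
    dropped on the application.
    """
    jobs = []
    for path in map(Path, dropped):
        if path.suffix.lower() != ".pdf":
            continue  # not a PDF at all
        if path.stem.endswith((FINEREADER_SUFFIX, MERGED_SUFFIX)):
            continue  # FineReader output or an earlier merge result
        ocr = path.with_name(path.stem + FINEREADER_SUFFIX + ".pdf")
        if not ocr.exists():
            continue  # no matching FineReader output in the same folder
        merged = path.with_name(path.stem + MERGED_SUFFIX + ".pdf")
        jobs.append((path, ocr, merged))
    return jobs
```

Given a folder holding the two input files listed above, plan_merges would return a single triple whose last element is the path of "2008-05-27 merged by PDF-Overlay.pdf". A PDF without a matching FineReader file yields no entry.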

Recommended FineReader Settings

Since PDF-Overlay paints the original scan over the FineReader PDF, it is best to set up FineReader to produce small, mainly-text documents with lightweight images. This minimizes the size of the merged PDF.

  • Saving mode: text and pictures only
  • Ask for name before saving: off
  • Picture Quality: Low (for Web)
  • Picture Format: Black and White (CCITT Group 4)