20,046 page doc archive still available

Paul Koning pkoning at equallogic.com
Wed Sep 8 13:54:26 CDT 2004


>>>>> "Joshua" == Joshua Boyd <jdboyd at jdboyd.net> writes:

 Joshua> On Tue, Sep 07, 2004 at 01:59:21PM -0500, John Foust wrote:
 >> I appreciate all the offers to host these files.
 >> 
 >> I was surprised that there were no suggestions about how to
 >> augment these files via OCR or PDF conversion.

 Joshua> Well, there are certainly plenty of ways to go from bitmaps
 Joshua> to PDFs of bitmaps, like the old MIT AI Memo PDFs were done.

 Joshua> A few years ago I tried to OCR some DEC manuals, but I wasn't
 Joshua> happy with the results I got at all.  One program didn't work
 Joshua> very well, and the other only wanted to spit out plain text
 Joshua> without keeping formatting or diagrams.  They were both
 Joshua> commercial programs.

I have had good success with Adobe's OCR plugin for Acrobat.  It is a
free download, limited to 50 pages at a time.  (It will handle bigger
docs in 50-page pieces.)  It worked well enough to produce useful
output from a manual full of pictures (a flight manual).

       paul



