Question about PDF manipulation

Paul Koning pkoning at equallogic.com
Thu Jun 2 15:58:10 CDT 2005


>>>>> "Jan-Benedict" == Jan-Benedict Glaw <jbglaw at lug-owl.de> writes:

 >> Ghostscript reads PDF files every bit as well as PS files, and
 >> it's open source...

 Jan-Benedict> You didn't answer my question :-) Suppose I prepare a
 Jan-Benedict> TIFF file that contains (with additional tags) e.g. some
 Jan-Benedict> raw OCRed text, not read-checked. Now I prepare a PDF
 Jan-Benedict> from this and use gs to get the image back.  Is my text
 Jan-Benedict> still there? Or do I get an image that "looks" almost
 Jan-Benedict> like the original, but doesn't contain my extra data?

Oh.  I didn't know TIFF could do that; I certainly would never store
text in a TIFF file, any more than I would store images in a DOC file.

I have no idea what would happen.

Keep in mind that PDF is a close relative of PostScript.  PostScript
does a nice job of describing bitmap images of any form a commercial
printer is ever going to encounter, but it's not a universal
meta-everything storage format.  PDF is likewise meant as a final-form
document encoding.  So I would expect the image data to come through
exactly as it was, but arbitrary embedded non-image content (such as
extra TIFF tags) may well disappear.

I suppose you could run the experiment and report back...
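
A minimal sketch of how such an experiment could be checked, offered as
an illustration only and not run against real files.  It assumes the
round trip was done with something like `tiff2pdf -o out.pdf in.tif`
followed by `gs -sDEVICE=tiff24nc -o back.tif out.pdf`, and that the
extra data sits in tags of the first IFD of a classic (non-BigTIFF)
file.  The function names `tiff_tag_ids` and `lost_tags` are made up
for this example.

```python
import struct


def tiff_tag_ids(data):
    """Return the set of tag IDs found in the first IFD of a classic TIFF."""
    # The first two bytes give the byte order: "II" little, "MM" big endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file")

    # Next come the magic number 42 and the offset of the first IFD.
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")

    # The IFD starts with a 2-byte entry count, then 12-byte entries,
    # each of which begins with its 2-byte tag ID.
    (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    tags = set()
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        (tag,) = struct.unpack(endian + "H", data[start:start + 2])
        tags.add(tag)
    return tags


def lost_tags(original, round_tripped):
    """Tag IDs present in the original file but missing after the round trip."""
    return tiff_tag_ids(original) - tiff_tag_ids(round_tripped)
```

Reading both files as bytes and calling `lost_tags(before, after)` would
list the tag numbers that did not survive.  An empty result would only
show that the tag IDs are still present, not that their contents are
unchanged.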

  paul


