Many things

Jim Leonard trixter at oldskool.org
Mon Jan 31 06:22:13 CST 2005


Eric Smith wrote:
>>That is one of the MISuses of PDF.  PDF should not be used as a container
>>for bitmap images.
> 
> Why?  What better open-standard file format can store a lot of pages using
> lossless bilevel compression?  PDF can store the original bitmaps
> (well-compressed) together with the OCR results, so that you can have
> mostly-searchable files that still look like the original document.  (As
> opposed to typical OCR files that are completely screwed up and lose
> information.)
> And PDF can support a mix of bilevel and greyscale or monochrome in
> the same document, or even on the same page.

I maintain that PDF should not be used merely as a container for existing 
graphics files because there is normally no easy free way to extract the image 
data and use it in another program.  I know that it *can* do it, but the 
majority of users who do this screw it up massively (I'm thinking 150 DPI JPGs 
of scanned text).

>>In case it wasn't obvious, PDF *is* Postscript!  It's *portable*
>>postscript.
> 
> Speaking as someone who has written software to read and write both
> Postscript and PDF, I can tell you in no uncertain terms that PDF is
> NOT Postscript.  PDF happens to use a subset of the Postscript
> imaging model, and has superficially similar syntax in some areas,
> but that's about as close as they get.

I am familiar with the internals of PDF as well, which is why I wrote
"portable".  Portable does not imply complete.  Perhaps I shouldn't have
used the "*is*" emphasis...

> Since PDF can do the same things, there seems to be little advantage
> to using DjVu instead.

DjVu has other advantages, such as local/window/viewport decoding of images 
with ludicrously high dimensions/resolutions, but I understand your point.

Where are the tools to create DjVu-like PDF files?  The best Acrobat can do is 
OCR text but still leave the source bitmap in place...  If I scan in a page 
with a background color image with B&W text foreground, where are the PDF tools 
to properly handle layer separation?  (Not CMYK separation, you know what I 
mean :-)
-- 
Jim Leonard (trixter at oldskool.org)                    http://www.oldskool.org/
Want to help an ambitious games project?             http://www.mobygames.com/
Or check out some trippy MindCandy at             http://www.mindcandydvd.com/


