Question about PDF manipulation

Jan-Benedict Glaw jbglaw at lug-owl.de
Thu Jun 2 13:08:52 CDT 2005


On Thu, 2005-06-02 13:35:05 -0400, Barry Watzman <Watzman at neo.rr.com> wrote:

> So what if it's not "searchable".  Get a clue:  THE ORIGINAL PAPER MANUAL
> WAS NOT "SEARCHABLE".  But if you or anyone else wants it to be searchable
> they are certainly welcome to OCR the document, a capability which Acrobat
> supports.

That's one of the main points. Somebody who scanned a document spent
some of his *own* time to make things easier for other people. He'd pass
the PDF around, but the data is only available to the people who get that
damn thing handed over. Having a nice set of scripts that also OCR the
images automatically would help (text quality doesn't matter: the most
important keywords will show up often enough that *one* of them will
most probably be correct).
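
A minimal sketch of why imperfect OCR is still good enough for keyword
search, assuming the OCR text for each page is already available as a plain
string. The sample pages and the function names build_index and search are
made up for illustration; no particular OCR tool is implied.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase token to the set of page numbers it was OCR'd on."""
    index = defaultdict(set)
    for page_no, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_no)
    return index

def search(index, keyword):
    """Return the sorted page numbers where the keyword was recognized."""
    return sorted(index.get(keyword.lower(), ()))

# Hypothetical OCR output with typical misreads ("d1sk", "contr0ller").
pages = {
    1: "The d1sk controller manual. Insert the disk into drive A.",
    2: "Contr0ller jumpers: set the disk c0ntroller to IRQ 5.",
    3: "Power supply specifications.",
}
index = build_index(pages)
```

Here search(index, "disk") still finds pages 1 and 2 even though one
occurrence on page 1 was misread, and search(index, "controller") still
finds the manual via page 1 although both occurrences on page 2 are garbled.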

> The people who don't like PDFs either have not used Acrobat extensively or
> don't understand the real nature of the product.  Acrobat allows you to
> create an electronic document that can have as many features (or as few) as
> the creator wants:

We're talking about PDF, not about Acrobat. I have yet to see Acrobat
running on vax-linux ...

> -Page images
> -Searchable, exportable, "copyable" text

Needs to be generated and handled somewhere.

> -Fonts and graphics
> -Printable
> -Fully "rearrangeable" (re-sequence, add, delete, replace pages)
> -Table of Contents (as hyperlink)

Needs to be created explicitly.

> -Index

Ditto.

> -Movies, photos, sound and other multimedia objects

Those won't come out of my scanner :)

> -Security control, passwords determining who can do what

This is what I'd banish to hell.

> And it and the documents it creates are multi-platform:  PC, Apple, Linux,
> Sun, IBM mainframe .... virtually every computer platform in existence.

...but Acrobat isn't.

> It's one of the best and most wonderful tools that the PC world has ever
> created.  Sorry if it's proprietary, but sometimes quality tools are only
> created by people who want to be paid for their work.

Maybe we'll get something even *better* for archiving old documents
and/or old *content* in general (right, think about disk images,
programs, ...)

It's not a tool problem, it's a conceptual one that first needs some
clever ideas. If we only ever try to fix little spots of it, we
won't ever get the big picture right.

MfG, JBG

-- 
Jan-Benedict Glaw       jbglaw at lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));



More information about the cctalk mailing list