Inventory for handling scanned documents (was: Better indexing on bitsavers)
der Mouse
mouse at Rodents.Montreal.QC.CA
Fri May 20 15:29:59 CDT 2005
> We've now named quite a lot of applications and concepts about how to
> handle scanned documents. I'd like to get the big picture:
> - How do you scan a paper document? Page by page?
Page by page, usually. If it's bound, and two pages fit together on my
scanner, I'll often do it two pages at a time.
> Do you use a script or something like that?
No. I will use shell history mechanisms to ease the task, but each
scan generally involves at least a few keystrokes.
> Do you directly scan b/w, or first use grayscale/colour and then
> degrade that to b/w?
This is a judgement call. Sometimes I'll do colour, sometimes
greyscale, sometimes binary. When I do binary I'll sometimes let the
scanner do it and sometimes I'll scan it some other way and convert it
by hand.
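To illustrate what converting a greyscale scan to binary outside the scanner can look like, here is a minimal sketch of fixed-cutoff thresholding. It is not necessarily the method used above. The function names, the default cutoff of 128, and the flat row-major pixel list are assumptions made for the example; the output follows the plain PBM convention that 1 is black.

```python
def threshold_to_binary(gray, width, height, cutoff=128):
    """Turn a row-major list of 0-255 grey values into rows of 0/1 bits.

    Follows the PBM convention: 1 is black ink, 0 is white paper.
    The cutoff value is an assumption; real scans often need tuning.
    """
    if len(gray) != width * height:
        raise ValueError("pixel count does not match width * height")
    rows = []
    for y in range(height):
        row = gray[y * width:(y + 1) * width]
        # Anything darker than the cutoff counts as ink.
        rows.append([1 if value < cutoff else 0 for value in row])
    return rows


def to_plain_pbm(rows):
    """Serialise 0/1 rows as the text of a plain (P1) PBM file."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    lines = ["P1", f"{width} {height}"]
    lines += [" ".join(str(bit) for bit in row) for row in rows]
    return "\n".join(lines) + "\n"
```

A single global cutoff is the simplest choice; pages with uneven lighting or yellowed paper may need a different cutoff per page or a more adaptive scheme.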
> - How do you work on the scanned images: Do you cut off the white rim
> as much as possible?
Usually.
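Cutting off the white rim amounts to finding the bounding box of the non-white pixels. A rough sketch, assuming a binary page held as a list of rows of 0/1 values with 1 meaning black (the function name and data layout are hypothetical, not taken from any tool mentioned in this thread):

```python
def crop_white_border(rows):
    """Return the smallest rectangle of `rows` that holds every black pixel.

    `rows` is a list of equal-length lists of 0/1 values, 1 meaning black.
    An entirely white page yields an empty list.
    """
    # Rows that contain at least one black pixel.
    ys = [y for y, row in enumerate(rows) if any(row)]
    if not ys:
        return []
    # Column indices of every black pixel on the page.
    xs = [x for row in rows for x, bit in enumerate(row) if bit]
    top, bottom = ys[0], ys[-1]
    left, right = min(xs), max(xs)
    return [row[left:right + 1] for row in rows[top:bottom + 1]]
```

Note that a single stray black dot near the page edge will stop the crop at that dot, so this works best on a page that has already been cleaned up.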
> How do you deal with images that are a tad rotated?
Ignore the rotation, usually. Egregious cases I may rescan. In some
cases, I touch up the orientation with pnmrotate (and a subsequent
pnmcut).
> How do you deal with single black dots in white areas or the other
> way around?
If the use is one for which they matter, I edit them out "by hand".
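For comparison with editing by hand, an automated pass for the simplest case might look like the sketch below. It flips any pixel whose neighbours all have the opposite value, which covers both single black dots in white areas and single white dots in black areas. The function name and the 8-neighbour rule are assumptions chosen for illustration; it will also erase genuine one-pixel marks such as small periods, which is one reason to prefer doing this by hand.

```python
def remove_isolated_pixels(rows):
    """Flip every pixel whose neighbours all have the opposite value.

    `rows` is a list of equal-length lists of 0/1 values. Decisions are
    made against the original image, so the result does not depend on
    the order in which pixels are visited.
    """
    height = len(rows)
    width = len(rows[0]) if rows else 0
    cleaned = [list(row) for row in rows]
    for y in range(height):
        for x in range(width):
            # Collect up to 8 surrounding pixels, clipped at the edges.
            neighbours = [
                rows[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if neighbours and all(n != rows[y][x] for n in neighbours):
                cleaned[y][x] = 1 - rows[y][x]
    return cleaned
```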
> - What digital format do you like to get when it's all finished?
> Plain PDF? PDF with some bookmarks? PDF with all headings as
> bookmarks? A new PDF-hyperref based index? Multiple
> TIFF/PNG/whatever images? Something like a web-based slide-show?
> ...or multiple formats (web-based for viewing, PDF for printing,
> ...)?
I dislike PDF. I loathe Web-specific stuff.
I generally just keep a directory around with the image files.
Sometimes I'll tar it up, sometimes I'll use compression programs like
gzip or bzip2 to reduce the storage requirements.
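As an illustration of the tar-plus-compression step, here is a minimal sketch using Python's standard `tarfile` module rather than the command-line tools named above. The function name and the example paths are hypothetical; the archive should be written outside the directory being archived so it does not end up containing itself.

```python
import tarfile
from pathlib import Path


def archive_scans(directory, archive_path, compression="gz"):
    """Pack a directory of page images into one compressed tar file.

    `compression` is "gz" for gzip or "bz2" for bzip2.
    """
    if compression not in ("gz", "bz2"):
        raise ValueError("compression must be 'gz' or 'bz2'")
    directory = Path(directory)
    with tarfile.open(archive_path, f"w:{compression}") as archive:
        # Store entries relative to the directory's own name so the
        # archive unpacks into a single top-level directory.
        archive.add(directory, arcname=directory.name)
    return archive_path
```

For example, `archive_scans("manual", "manual.tar.bz2", "bz2")` would pack a directory named `manual` into a bzip2-compressed archive next to it.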
> - What do you currently use as your software:
> Operating system:
NetBSD (1.4T plus a number of private hacks).
> PDF viewer:
GhostScript (8.30 at the moment).
> TIFF viewer:
tifftopnm, often transformed with pnm tools like pnmscale, with the
resulting p*m file displayed using a picture display program of my own.
> Browser/other viewers you'd love to use:
I've yet to find really *good* such tools, probably because my
user-interface tastes are unusual.
/~\ The ASCII der Mouse
\ / Ribbon Campaign
X Against HTML mouse at rodents.montreal.qc.ca
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B