fixing broken .Z files?

Jim Leonard trixter at oldskool.org
Thu Mar 3 22:15:00 CST 2005


der Mouse wrote:
> As a simple example, if the high bit of a compressed codon is 1, the
> next bit is significantly more likely to be 0 than 1 - at least until
> the table fills up.  For a more complex example, consider the Shannon
> estimate of one bit per letter for normal connected English text.  At

With all due respect, we're talking about LZW, not Shannon's English-language 
estimate -- I don't think your comments are appropriate in the context of this 
discussion.  Pop on over to comp.compression and I'd be happy to discuss the 
amount of entropy present in human language, but once data gets transformed by 
LZW it has very little redundancy left.  If it had more, then by your argument 
you could still recompress it with a bitwise encoder like PAQ -- and you can't, 
at least not by a significant margin.
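
For anyone who would rather measure than argue, here is a minimal sketch that 
checks the quoted bit-skew claim on a sample.  It is a simplified 
textbook LZW encoder, NOT the actual compress(1) .Z format: no header, no 
clear code, and the code-width rule is my own assumption.  The names 
lzw_codes and high_bit_stats are made up for this illustration.

```python
def lzw_codes(data: bytes, max_bits: int = 16):
    """Return (code, width) pairs from a simplified LZW encoder.

    Assumption: codes start at 9 bits and widen as the table grows.
    This is a textbook variant, not the real .Z file format.
    """
    table = {bytes([i]): i for i in range(256)}
    next_code = 256          # first free table slot (no clear code here)
    out = []
    w = b""
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc           # keep extending the current match
            continue
        # Width needed to hold the largest code assigned so far.
        width = max(9, (next_code - 1).bit_length())
        out.append((table[w], width))
        if next_code < (1 << max_bits):
            table[wc] = next_code
            next_code += 1
        w = bytes([byte])
    if w:
        width = max(9, (next_code - 1).bit_length())
        out.append((table[w], width))
    return out


def high_bit_stats(codes):
    """Among codes whose top bit is 1, count those whose next bit is 0."""
    high = zero_next = 0
    for code, width in codes:
        if (code >> (width - 1)) & 1:
            high += 1
            if not ((code >> (width - 2)) & 1):
                zero_next += 1
    return high, zero_next


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog " * 200
    high, zero_next = high_bit_stats(lzw_codes(sample))
    print(f"{zero_next} of {high} high-bit codes have a 0 as the next bit")
```

Run it over your own files to see how large the skew is while the table is 
still filling, and whether it is big enough to matter for repair work.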

>>What I think you're getting at is that, since the source data has a
>>lot of redundancy *and* is human parsable, it can be reconstructed
>>more easily in the case of mangled data.
> 
> Well, yeah; "is human parsable" is a form of redundancy, but one that
> is almost impossible for programs to take advantage of - certainly

Actually, WinRK currently achieves the very best Calgary Corpus score by using 
a dictionary, so it is most definitely exploitable.
-- 
Jim Leonard (trixter at oldskool.org)                    http://www.oldskool.org/
Want to help an ambitious games project?             http://www.mobygames.com/
Or check out some trippy MindCandy at             http://www.mindcandydvd.com/

