 Character Set News
 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.dcom.telecom
Path: cs.utk.edu!emory!europa.eng.gtefsd.com!howland.reston.ans.net
      !spool.mu.edu!telecom-request
Date: 29 Oct 1993 12:20 -0600
Message-ID: <telecom13.725.1@eecs.nwu.edu>
X-Telecom-Digest: Volume 13, Issue 725, Message 1 of 8
From: Rob Slade <roberts@decus.arc.ab.ca>
Subject: Book Review: "The Unicode Standard"

BKUNICOD.RVW  980921

    Addison-Wesley Publishing Co.
    P.O. Box 520
    26 Prince Andrew Place
    Don Mills, Ontario  M3C 2T8
    416-447-5101 fax: 416-443-0948
or
    1 Jacob Way
    Reading, MA   01867-9984
    800-527-5210  617-944-3700
or
    5851 Guion Road
    Indianapolis, IN   46254
    800-447-2226
or
    Unicode, Inc.
    1965 Charleston Road
    Mountain View, CA  94043
    (415) 961-4189   Fax: (415) 966-1637

"The Unicode Standard", U$32.95/C$42.95
<steve@unicode.org> <unicode-inc@unicode.org> <rick_mcgowan@next.com>
 

In the dim and distant past, the late (and generally unlamented) SUZY
Information System was born in Vancouver.  Rather an oddball as far as
online services went, one "feature" was that the programmer had tried
to allow for the use of all of the IBM graphics characters.  This led
to an entirely new field of "smiley" or "emoticon" (emotional icon)
endeavours.  Instead of the usual sideways happy face of the colon,
hyphen and right parenthesis, ":-)", we were able to use the "Ctrl-A"
alternative of the IBM PC character set.  Having a decimal value of
one, this character is an upright happy face.  This allowed other
expansions, such as Ctrl-A and the right square bracket, which looks
like a face and a telephone handset, and was used (usually in the
"chat" modes) for "I am on the phone."
 
"How nice," I hear you mutter between clenched teeth.  "Can we now get
on with the review?"  Patience, stout nerds.  This *is* the review.
 
As SUZY users, particularly those who had been introduced to computer
communications on the system, moved on to other services or local
bulletin boards, they were usually quite shocked to find that their
favourite symbols no longer worked.  The little diamond (Ctrl-C) would
kill a message on a VAX.  Fidonet users might find that the cute
tagline they had formed from graphics characters completely
disappeared when they sent the message through an Internet gateway.
 
ASCII (the American Standard Code for Information Interchange) is
widely, and mistakenly, believed to define two hundred and fifty-six
characters.

It doesn't.

Furthermore, of the hundred and twenty-eight characters it does
define, many are "control" rather than printable characters.  (The
"card suit" symbols on the IBM PC graphics set are defined as "end of
text", "end of transmission", "enquiry" and "acknowledgement" under
the real ASCII standard.)  In addition, many believe ASCII to be a
universal standard; also not true.  An octet with the decimal value
thirty-five, for example, is the number sign (sometimes called an
"octothorpe") in the United States, but a pound sign (the British
currency) in Britain.  As with most fields of computer endeavor, the
nice thing about standards is that there are so many to choose from.
Many vary only slightly--but they vary.
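
The points above can be checked with a few lines of Python.  This is
only a sketch; the names (ASCII_SIZE, CONTROL_NAMES, is_ascii) are
illustrative and do not come from the review or from any standard.

```python
# Sketch: ASCII is a 7-bit code, so it has 2**7 = 128 positions, not 256.
ASCII_SIZE = 2 ** 7

# ASCII control names for the codes that the IBM PC set draws as
# heart, diamond, club and spade (decimal 3 to 6).
CONTROL_NAMES = {3: "ETX (end of text)", 4: "EOT (end of transmission)",
                 5: "ENQ (enquiry)", 6: "ACK (acknowledge)"}

# Positions 0-31 and 127 are controls; the rest are printable (incl. space).
printable = [chr(c) for c in range(ASCII_SIZE) if 32 <= c < 127]

def is_ascii(text: str) -> bool:
    """Return True if every character fits in the 7-bit ASCII range."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

print(ASCII_SIZE, len(printable))              # 128 95
print(chr(35))                                 # '#' in US-ASCII
print(is_ascii("favourite"), is_ascii("£5"))   # True False
```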
 
The point is that there are a number of symbols which we commonly
know, but which cannot be consistently displayed on terminals or
printers.  Certain terminals will have certain "international"
character sets, but not all are identical.  Accents and other phonetic
modifiers may be difficult to handle: entire character sets are given
over strictly to accented characters.  (In Canada we are acutely aware
of the problems, with "French" keyboards used at many sites.  On one,
I was having difficulty finding some necessary punctuation marks for
network addressing, and asked a Francophone programmer for help.  "Who
knows," he growled, "I never use the ____ things!")
 
Unicode seeks to address this problem.  Besides the variations on the
Latin alphabet, Unicode incorporates Greek, Cyrillic, Hebrew and other
alphabets.  It also includes punctuation,
diacriticals, mathematical and scientific symbols and miscellaneous
graphics.  Asian ideographs are also assigned codes.  This repertoire
no longer fits, of course, in a seven-bit code, and Unicode is based
on a sixteen-bit address space.
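
As a small illustration of the sixteen-bit code space described above,
the sketch below prints the code value of one character from each of
several scripts.  The sample characters are arbitrary, and "utf-16-be"
is used only as a convenient stand-in for a fixed two-byte encoding;
the review itself does not name an encoding form.

```python
# Sketch: code points from several scripts, each fitting in 16 bits.
samples = {
    "Latin": "A",
    "Greek": "\u03a9",       # GREEK CAPITAL LETTER OMEGA
    "Cyrillic": "\u0416",    # CYRILLIC CAPITAL LETTER ZHE
    "Hebrew": "\u05d0",      # HEBREW LETTER ALEF
    "Han": "\u4e2d",         # a CJK ideograph
}

for script, ch in samples.items():
    code = ord(ch)
    two_bytes = ch.encode("utf-16-be")   # one 16-bit unit, big-endian
    print(f"{script:9} U+{code:04X} {two_bytes.hex()}")
```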
 
The book gives some background and plans (chapter one), and general
principles and rules for conformance (chapter two).  To comment on
these in any meaningful way would be to rewrite these chapters.  This
is technical material, though not the same technology that computer
types are used to.  Some background study in linguistics would be a
good idea, although it is not strictly necessary to understand and use
the Unicode standard.  There is, however, a wealth of symbols,
punctuation marks and typesetting codes which Unicode gives
standardized access to.  On the other hand, any application which used
the standard in a significant way would likely require a linguistics
background in any case.
 
The bulk of the books (two volumes) is, of course, taken up with the
actual code charts.  (Volume two, in fact, is almost completely
concerned with Han ideographs.  In spite of the recent widespread use
of the English alphabet, this is still the standard written language
of Chinese, Japanese and Korean: CJK in Unicode terminology.)  The
charts are augmented with verbal definitions of the symbols, and with
cross references to similar forms.
 
The Unicode standard is recent.  In comparative terms its current
usage is negligible.  However, it is the de facto standard for broadly
based international character sets.  With the recent rejection of the
proposed ISO thirty-two bit standard, and the recasting of that
standard to follow Unicode's lead, Unicode is a significant factor in
the development of any international applications.
 
copyright Robert M. Slade, 1993   BKUNICOD.RVW  980921
 
(Postscriptum - Unicode Inc. maintains an FTP site at unicode.org
(192.195.185.2).  Some of the mapping tables, and the Han cross
reference lists are available.  Some tables are also available on IBM
PC or Mac compatible floppy disks.)

    http://www.unicode.org/

Permission granted to distribute only with unedited copies of TELECOM
Digest and associated newsgroups/mailing lists.


DECUS Canada Communications, Desktop, Education and Security group newsletters
Editor and/or reviewer ROBERTS@decus.ca, RSlade@sfu.ca, Rob Slade at 1:153/733
DECUS Symposium '94, Vancouver, BC, Mar 1-3, 1994, contact: rulag@decus.ca

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

An older introductory book on this subject is

"Coded Character Sets: History and Development" by C. E. MacKenzie.
Reading: Addison-Wesley, 1980.

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.std.internat,comp.protocols.tcp-ip
Path: utkcs2!emory!samsung!cs.utexas.edu!sun-barr!decwrl!mcnc!uvaarpa!murdoch
Message-ID: <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU>
Date: 10 Apr 91 17:27:56 GMT
References: <16968@hoptoad.uucp> <1110@sranha.sra.co.jp>
Sender: usenet@murdoch.acc.Virginia.EDU
Organization: University of Virginia
Lines: 60
From: randall@Virginia.EDU (Randall Atkinson)
Subject: Re: universality of Latin-1


John Gilmore originally wrote:
% And my windows all use ISO Latin 1.  If Torbj|rn would send the
% umlauted letter in that standardized character set, it would look right
% in both the States and in Sweden.

In article <1110@sranha.sra.co.jp>, 
	Erik M. van der Poel <erik@srava.sra.co.jp> responded:
>
> Have you ever tried to send yourself a message in Latin-1? Did it
> work? And even if *you* have a reasonable version of sendmail (one
> that doesn't strip the 8th bit), what makes you so certain that
> Torbj|rn's message and anyone else's won't pass through a site that
> *does* strip the 8th bit?

It does work for a fair and ever increasing subset of the Internet.
BITNET doesn't do very well with it.  Clearly we need to move towards
8-bit and 16-bit and 32-bit transparent mail-transport mechanisms.
Fortunately there are a number of possible transport mechanisms out
there to choose from, some of which are already 8-bit transparent.
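
To make the "strips the 8th bit" problem concrete, here is a minimal
sketch of what a 7-bit relay does to a Latin-1 name.  The function name
strip_eighth_bit is made up for the illustration; no real mailer is
being quoted.

```python
def strip_eighth_bit(data: bytes) -> bytes:
    """Clear the high bit of every byte, as a 7-bit mail relay might."""
    return bytes(b & 0x7F for b in data)

original = "Torbjörn".encode("iso-8859-1")   # o-umlaut is byte 0xF6
mangled = strip_eighth_bit(original)

print(original)   # b'Torbj\xf6rn'
print(mangled)    # b'Torbjvrn'  (0xF6 & 0x7F == 0x76, ASCII 'v')
```

Plain ASCII text passes through unchanged, which is why the damage
often goes unnoticed until an accented letter is sent.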

> Also, what's so "standardized" about ISO Latin-1? What makes it more
> standard than, say, Latin-2?

ISO 8859/1 is NOT any "more standard" than ISO 8859/2, however sites
in the US are in fact migrating towards ISO 8859/1 from US ASCII and
most sites in the US are NOT migrating towards ISO 8859/2 (though they
might support it on the side as vendors begin to).  The languages that
are most commonly used in the US are in ISO 8859/1 and the languages
supported by ISO 8859/2 are less commonly used (again in the US as a
whole).  

Note that ISO Latin-1 is ISO 8859/1 which is the 8-bit character set
used for Western European languages.  ISO Latin-2 is ISO 8859/2 which
is the 8-bit character set for Eastern European languages.

Clearly we need to add additional information to the header of mail 
messages to indicate which character set to use.  I'm not sure of
the current state of the Internet protocols (RFC 822 et al.) with
respect to this.  If there isn't the equivalent of a "Character-set:"
header yet, serious consideration should be given to adding one with
clearly defined values for at least existing ANSI and ISO character
sets.

    [ARCHIVER'S NOTE:  the Multipurpose Internet Mail Extensions (MIME)
     protocol defines character-set-selection headers for SMTP e-mail.
     See the Internet standards RFC1521, RFC1523, and RFC1425.]

Character sets that should have a defined string to use with such a
header field include at least:

	ASCII
	ISO 8859/1 
          ...
        ISO 8859/N  (where N is the last defined set)
        ISO 10646   (once it gets completed)
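
For comparison, MIME (see the archiver's note above) carries the
character set as a "charset" parameter on the Content-Type header
rather than as a separate header.  The sketch below uses Python's
standard email package to label a body as ISO 8859-1; the addresses and
subject are placeholders, and quoted-printable is chosen here only so
that the body stays 7-bit clean.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.org"       # placeholder address
msg["To"] = "recipient@example.org"      # placeholder address
msg["Subject"] = "Latin-1 test"
# Label the body as ISO 8859-1 and use quoted-printable so that the
# message still survives a transport that strips the 8th bit.
msg.set_content("Hej, Torbjörn!", charset="iso-8859-1",
                cte="quoted-printable")

print(msg["Content-Type"])               # names the charset
print(msg["Content-Transfer-Encoding"])  # quoted-printable
print(msg.get_payload())                 # ASCII only: "=F6" is o-umlaut
```

A receiving program can call msg.get_content() to get the decoded text
back, using the charset named in the header.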

The Internet is the dominant mail transport network at present, partly
because so many other networks gateway with it.  Getting the Internet
to convert to supporting such needs would be a big step in the right
direction.  Perhaps someone on the IETF can comment on their current
activities in this area ??

Ran Atkinson
randall@Virginia.EDU

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.std.internat,comp.protocols.tcp-ip
Path: utkcs2!emory!swrinde!cs.utexas.edu!sun-barr!newstop!sun!amdcad!dgcad
      !dg-rtp!chutney!eliot
Message-ID: <1991Apr12.124741.11555@dg-rtp.dg.com>
Date: 12 Apr 91 12:47:41 GMT
References: <16968@hoptoad.uucp> <1110@sranha.sra.co.jp>
            <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU>
Organization: Data General Corporation, Research Triangle Park, NC
From: eliot@chutney.rtp.dg.com (Topher Eliot)
Subject: Re: universality of Latin-1

In article <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU>,
             randall@Virginia.EDU (Randall Atkinson) writes:
|>
|> In article <1110@sranha.sra.co.jp>, 
|> 	Erik M. van der Poel <erik@srava.sra.co.jp> responded:
|> >Have you ever tried to send yourself a message in Latin-1? Did it
|> >work? And even if *you* have a reasonable version of sendmail (one
|> >that doesn't strip the 8th bit), what makes you so certain that
|> >Torbj|rn's message and anyone else's won't pass through a site that
|> >*does* strip the 8th bit?
|> It does work for a fair and ever increasing subset of the Internet.
|> BITNET doesn't do very well with it.  Clearly we need to move towards
|> 8-bit and 16-bit and 32-bit transparent mail transport mechanisms.


I expected to see someone else post a more authoritative answer, but since
none has been forthcoming, I will venture.  The folks who work on such things
have been considering the 8-bit, different-codeset issues, as part of a much
larger picture of including such things as graphics and other binary
information in mail.  Since those are harder problems, it means that they
won't have solutions all that quickly.  There is a mailing list on this
subject; if you really need it I can probably dig out a lead on how to get
onto that mailing list.

|> Fortunately there are a number of possible transport mechanisms out
|> there to choose from, some of which are already 8-bit transparent.

Ack!  "Fortunately"?  There is an ancient curse:  "may you live in interesting
times".  I think its modern equivalent is "may you have many standards to
choose from".  

-- 
Topher Eliot                           Data General DG/UX Internationalization
(919) 248-6371        62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com                           {backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.misc
Path: utkcs2!emory!sol.ctr.columbia.edu!spool.mu.edu!agate!sunkist.berkeley.edu
Message-ID: <1991May29.000449.19048@agate.berkeley.edu>
Date: 29 May 91 00:04:49 GMT
References: <10599@castle.ed.ac.uk>
Reply-To: raymond@math.berkeley.edu (Raymond Chen)
In-Reply-To: eanv20@castle.ed.ac.uk (John Woods)
From: raymond@math.berkeley.edu (Raymond Chen)
Subject: Re: Name that character! (definitive list)

Why does everyone feel compelled to post their favorite pronunciations?

In article <10599@castle.ed.ac.uk>, eanv20@castle (John Woods) writes:
>I wonder if there is a definitive list...

Indeed there is.  It used to be part of the comp.unix.questions
Frequently Asked Questions file, but it has since moved into the
`Jargon File'.  Many thanks to Maarten Litmaath for maintaining
the USENET ASCII Pronunciation Guide for many years.  (Though the
list below does seem to be missing some of the cleverer names
in Maarten's list.  Like `Donald Duck' for `&'.)

<ASCII> [American Standard Code for Information Interchange] /as'kee/
   n. Common slang names for ASCII characters are collected here.  See
   individual entries for <bang>, <close>, <excl>, <open>, <ques>,
   <semi>, <shriek>, <splat>, <twiddle>, <what>, <wow>, and <Yu-Shiang
   whole fish>.  This list derives from revision 2.2 of the USENET
   ASCII pronunciation guide.  Single characters are listed in ASCII
   order, character pairs are sorted by first member.  For each
   character, "official" names appear first, then others in order of
   popularity (more or less).

!
     exclamation point, exclamation, bang, factorial, excl,
     ball-bat, pling, smash, shriek, cuss, wow, hey, wham

"
     double quote, quote, dirk, literal mark, rabbit ears

#
     number sign, sharp, crunch, mesh, hex, hash, flash, grid,
     pig-pen, tictactoe, scratchmark, octothorpe, thud

$
     dollar sign, currency symbol, buck, cash, string (from
     BASIC), escape (from <TOPS-10>), ding, big-money, cache

%
     percent sign, percent, mod, double-oh-seven

&
     ampersand, amper, and, address (from C), andpersand

'
     apostrophe, single quote, quote, prime, tick, irk, pop,
     spark

()
     open/close parenthesis, left/right parenthesis,
     paren/thesis, lparen/rparen, parenthisey, unparenthisey,
     open/close round bracket, ears, so/already, wax/wane

*
     asterisk, star, splat, wildcard, gear, dingle, mult

+
     plus sign, plus, add, cross, intersection

,
     comma, tail

-
     hyphen, dash, minus sign, worm

.
     period, dot, decimal point, radix point, point, full stop,
     spot

/
     virgule, slash, stroke, slant, diagonal, solidus, over, slat

:
     colon

;
     semicolon, semi

<>
     angle brackets, brokets, left/right angle, less/greater
     than, read from/write to, from/into, from/toward, in/out,
     comesfrom/ gozinta (all from UNIX), funnel, crunch/zap,
     suck/blow

=
     equal sign, equals, quadrathorp, gets, half-mesh

?
     question mark, query, whatmark, what, wildchar, ques, huh,
     hook

@
     at sign, at, each, vortex, whorl, whirlpool, cyclone, snail,
     ape, cat

V
     vee, book

[]
     square brackets, left/right bracket, bracket/unbracket,
     bra/ket, square/unsquare, U turns

\
     reversed virgule, backslash, bash, backslant, backwhack,
     backslat, escape (from UNIX), slosh.

^
     circumflex, caret, uparrow, hat, chevron, sharkfin, to ("to
     the power of"), fang

_
     underscore, underline, underbar, under, score, backarrow

`
     grave accent, grave, backquote, left quote, open quote,
     backprime, unapostrophe, backspark, birk, blugle, back tick,
     push

{}
     open/close brace, left/right brace, brace/unbrace, curly
     bracket, curly/uncurly, leftit/rytit, embrace/bracelet

|
     vertical bar, bar, or, or-bar, v-bar, pipe, gozinta, thru,
     pipesinta (last four from UNIX)

~
     tilde, squiggle, approx, wiggle, twiddle, swung dash, enyay

   Some other common usages cause odd overlaps.  The ``$'', ``#'', and ``&''
   chars, for example, are all pronounced `hex' in different
   communities because various assemblers use them as a prefix tag for
   hexadecimal constants (in particular, $ in the 6502 world and & on
   the Sinclair and some other Z80 machines).


   ................................................
   ARCHIVER'S NOTE
   The jest about Donald Duck comes from the name
   used for this Disney character in Denmark:
   "Anders And".
   ................................................

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.std.internat
Path: utkcs2!emory!att!bu.edu!wang!ice
Message-ID: <b713vk.j3d@wang.com>
Date: 14 Jun 91 22:02:07 GMT
References: <5565@mrmarx.UUCP>
Organization: Addictive Technologies and Various Magick
From: ice@wang.com (Fredrik Nyman)
Subject: Re: HELP requested on internationalization

sgh@mrmarx.msc.com (Satyen Harve) writes:
>
>I have just been given the responsibility of coming up with a
>plan to internationalize our product.  As a first step, I have
>to identify all the issues that are involved and determine
>their impact on our product.  I would very much appreciate
>hearing from someone who has gone through or is going through
>this process.

>I'd particularly like to get any tips or information on what
>all is involved and where to go to read more about it.  We are
>hoping to address both Europe and Asian markets.

I'd like to suggest that you get:

	"Digital Guide to Developing International Software"
	from Digital Press.
	Order # EY-F577E-DP
	ISBN # 1-55558-063-7
	
The book is geared towards the DEC platforms and the various
libraries available to VMS, Ultrix and DECwindows programmers.

Even if you couldn't care less about these platforms, the book is very
valuable.  Among other things, it describes common character sets and
has quite extensive guidelines for dealing with internationalization
which are valid no matter what platform you're using.

DEC can be reached at 1-800-DIGITAL if you want to order this manual.
Outside the US, in New Hampshire, Alaska and Puerto Rico: 1-603-884-6660
-- 
Fredrik Nyman [Surgically Enhanced Cyberdweeb]  <ice@wang.com>  DoD #0328
Global Adaptation Center, Wang, M/S 019-490,    NeXT: <ice@red-zinger.wang.com>
One Industrial Ave., Lowell MA 01851, USA       BITNET: <ice@drycas>


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.os.vms
Path: utkcs2!emory!swrinde!cs.utexas.edu!sun-barr!newstop!west!texsun!smunews!txsil!danmc
Message-ID: <475@txsil.lonestar.org>
Date: 15 Jun 91 23:00:32 GMT
References: <199113.1053.9712@canrem.uucp>
Distribution: comp.os.vms
Organization: Summer Institute of Linguistics, Dallas
From: danmc@txsil.lonestar.org (Dan McDonald)
Subject: Re: vt3xx soft fonts??

In article <199113.1053.9712@canrem.uucp>
    "jonathan harley" <jonathan.harley@canrem.uucp> writes:
>
>Do you know of any available packages that provide VT3xx (or better)
>downloadable soft fonts to emulate the IBM PCs graphics character set?

As for ones that emulate the IBM PC's, no, but it would probably only take a
couple of hours to make one - there are only 128 characters to set up. 

>
>If so, where might I obtain the soft fonts, how much $ etc.
>

I wrote a program (in DCL - my favorite programming language) that would take 
bitmaps in a form like:
A 65
1    X
2   X X 
3  X   X
4 X     X
5 XXXXXXX
6 X     X 
7 X     X
8

and would convert them to the down-line loadable format.  I use it mainly
when I need to design another International Phonetic Alphabet softfont
for someone writing a thesis around here.
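
The DCL program itself is not reproduced here.  As a rough sketch of
the idea, the Python below reads the bitmap format shown above and
encodes it as sixel characters, the column-wise encoding (six pixels
per character, top pixel in the lowest bit, offset 63) that DEC
terminals use for down-line loaded glyphs.  The function names, the
fixed seven-pixel width and the single-digit row labels are assumptions
of this sketch, and the control string that must surround the data on a
real terminal is left out.

```python
GLYPH = """\
A 65
1    X
2   X X
3  X   X
4 X     X
5 XXXXXXX
6 X     X
7 X     X
8
"""

def parse_glyph(text, width=7):
    """Return (name, code, rows); each row is a list of 0/1 pixels."""
    lines = text.splitlines()
    name, code = lines[0].split()
    rows = []
    for line in lines[1:]:
        pixels = line[2:].ljust(width)[:width]   # skip the "N " row label
        rows.append([1 if c == "X" else 0 for c in pixels])
    return name, int(code), rows

def to_sixels(rows):
    """Encode rows as sixel characters, one band of six rows at a time."""
    bands = []
    for top in range(0, len(rows), 6):
        band = rows[top:top + 6]
        chars = []
        for col in range(len(rows[0])):
            value = sum(row[col] << bit for bit, row in enumerate(band))
            chars.append(chr(63 + value))   # sixel: '?' is an empty column
        bands.append("".join(chars))
    return bands

name, code, rows = parse_glyph(GLYPH)
print(name, code, to_sixels(rows))   # A 65 ['wSQPQSw', '@?????@']
```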

If you would like code and an example of how to use it, send me e-mail and I
will be happy to dig it up and send it to you.

******************************************************************************
Dan McDonald                    * UUCP      ...utafll!txsil!dalsil!mcdonald
Summer Institute of Linguistics * Internet  mcdonald@dallas.sil.org
Dallas Computer Services        *   -OR-    danmc@txsil.lonestar.org
7500 W Camp Wisdom Rd           * SILnet    DAN.MCDONALD@A1@DALLAS
Dallas, TX 75236                * POTSnet   (214)709-3389
USA                             * FAXnet    (214)709-3387

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.fonts
Path: cs.utk.edu!ornl!fnnews.fnal.gov!mp.cs.niu.edu!news.ecn.bgu.edu!wupost
      !howland.reston.ans.net!usc!elroy.jpl.nasa.gov!ames!pacbell.com!pacbell
      !boo!seer!ariel
Summary: Hungarian alphabet is Latin alphabet
Message-ID: <1993Apr22.153120.2440@seer.gentoo.com>
Date: Thu, 22 Apr 1993 15:31:20 GMT
References: <1993Apr21.150237.1930@wheaton.wheaton.edu>
Organization: Brad Lanam,  Walnut Creek, CA
From: ariel@seer.gentoo.com (Cathy Hampton)
Subject: Re: Hungarian Keyboard Layout


The Hungarian language, or Magyar, uses the Latin alphabet.  If no one here
responds by tomorrow with the keyboard layout, I have it at home in one of
my language books, I think.  (I lived in Vienna for quite a while and learned
a little Hungarian.)

Catherine Hampton
================================================================
Compuserve: 71601,3130       GEnie: ARIEL         GEnie: AMNESTY
Internet: ariel@seer.gentoo.com    Internet/IGC: cah@igc.apc.org
================================================================

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Article 8620 of comp.fonts:
Path: cs.utk.edu!ornl!fnnews.fnal.gov!lll-winken.llnl.gov!uwm.edu!wupost
      !howland.reston.ans.net!ira.uka.de!Germany.EU.net!news.netmbx.de
      !mailgzrz.TU-Berlin.DE!fub!spoolbag.in-berlin.de!rainbow.in-berlin.de
      !rainbow.in-berlin.de!not-for-mail
From: rj@rainbow.in-berlin.de (Robert Joop)
Newsgroups: comp.fonts
Subject: Re: Latin 1 and Latin 3?
Date: 24 Apr 1993 04:07:43 +0200
Lines: 68
Message-ID: <1ra7df$pg0@rainbow.in-berlin.de>
References: <1993Apr22.115504.17537@news.columbia.edu>
NNTP-Posting-Host: rainbow.in-berlin.de
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

pcj1@cunixf.cc.columbia.edu (Pierre Jelenc) writes:

>I am looking for the assignments of characters to bytes in the Latin 1
>and Latin 3 character sets. In particular, I am concerned with the
>discrepancies between the tables found in DOS and windows manuals and the
>actual Latin 1 character set, and with the differences between Latin 1 and
>Latin 3.

from rfc1345 (Character Mnemonics & Character Sets):

[...]
  &charset ISO_8859-1:1987
  &rem source: ECMA registry
  &alias iso-ir-100
  &g1esc x2d41 &g2esc x2e41 &g3esc x2f41
  &alias ISO_8859-1
  &alias ISO-8859-1
  &alias latin1
  &alias l1
  &alias IBM819
  &alias CP819
  &code 0
  NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
  DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
  SP ! " Nb DO % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
  At A B C D E F G H I J K L M N O P Q R S T U V W X Y Z <( // )> '> _
  '! a b c d e f g h i j k l m n o p q r s t u v w x y z (! !! !) '? DT
  PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3
  DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC
  NS !I Ct Pd Cu Ye BB SE ': Co -a << NO -- Rg '-
  DG +- 2S 3S '' My PI .M ', 1S -o >> 14 12 34 ?I
  A! A' A> A? A: AA AE C, E! E' E> E: I! I' I> I:
  D- N? O! O' O> O? O: *X O/ U! U' U> U: Y' TH ss
  a! a' a> a? a: aa ae c, e! e' e> e: i! i' i> i:
  d- n? o! o' o> o? o: -: o/ u! u' u> u: y' th y:
[...]
  &charset ISO_8859-3:1988
  &rem source: ECMA registry
  &alias iso-ir-109
  &g1esc x2d43 &g2esc x2e43 &g3esc x2f43
  &alias ISO_8859-3
  &alias ISO-8859-3
  &alias latin3
  &alias l3
  &code 0
  NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
  DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
  SP ! " Nb DO % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
  At A B C D E F G H I J K L M N O P Q R S T U V W X Y Z <( // )> '> _
  '! a b c d e f g h i j k l m n o p q r s t u v w x y z (! !! !) '? DT
  PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3
  DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC
  NS H/ '( Pd Cu ?? H> SE ': I. S, G( J> -- ?? Z.
  DG h/ 2S 3S '' My h> .M ', i. s, g( j> 12 ?? z.
  A! A' A> ?? A: C. C> C, E! E' E> E: I! I' I> I:
  ?? N? O! O' O> G. O: *X G> U! U' U> U: U( S> ss
  a! a' a> ?? a: c. c> c, e! e' e> e: i! i' i> i:
  ?? n? o! o' o> g. o: -: g> u! u' u> u: u( s> '.
[...]

the mnemonics are explained in the rfc. rfc's can be found on many ftp sites.
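
A quick way to see where the two tables above disagree is to decode
every upper-half byte with both character sets and list the positions
that differ.  The sketch uses the codecs that ship with Python
("iso8859_1" and "iso8859_3"); the helper name decode_or_none is made
up for the example.

```python
def decode_or_none(byte, codec):
    """Decode one byte with the codec; None if the position is unassigned."""
    try:
        return bytes([byte]).decode(codec)
    except UnicodeDecodeError:
        return None

differences = {}
for byte in range(0xA0, 0x100):
    l1 = decode_or_none(byte, "iso8859_1")
    l3 = decode_or_none(byte, "iso8859_3")
    if l1 != l3:
        differences[byte] = (l1, l3)

for byte, (l1, l3) in differences.items():
    print(f"0x{byte:02X}  Latin-1: {l1}  Latin-3: {l3 or '(unassigned)'}")
```

Positions printed as "(unassigned)" correspond to the "??" entries in
the Latin 3 table.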

rj
-- 
__________________________________________________
Robert Joop
  rj@{rainbow.in-berlin,fokus.gmd,cs.tu-berlin}.de
  s=joop;ou=fokus;ou=berlin;p=gmd;a=dbp;c=de

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: bit.listserv.win3-l
Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
Path: cs.utk.edu!darwin.sura.net!newsserver.jvnc.net!news.cac.psu.edu!psuvm
      !auvm!LUCS-01.NOVELL.LEEDS.AC.UK!ECL6TAM
Return-Path: <@AUVM.AMERICAN.EDU,@VTBIT.BITNET:WIN3-L@UICVM.BITNET>
Via:        UK.AC.LEEDS.GPS;  2 JUL 93  8:56:26 BST
Message-ID: <MAILQUEUE-101.930702085423.320@lucs-01.novell.leeds.ac.uk>
Date:         Fri, 2 Jul 1993 08:54:23 GMT
Reply-To:     T.A.McAllister@mailer.leeds.ac.uk
Sender:       Microsoft Windows Version 3 Forum <WIN3-L@UICVM.BITNET>
From:         Alec McAllister <ECL6TAM@LUCS-01.NOVELL.LEEDS.AC.UK>
Subject:      Re: Foreign language keyboards (German)


Apologies if you have already seen this. It was returned, implying that it
had never reached the list.


>Date:          Thu, 1 Jul 1993 16:52:25 GMT
>From:          Alec McAllister <ECL6TAM@LUCS-01.NOVELL.LEEDS.AC.UK>
>Subject:       Re: Foreign language keyboards (German)
>
>>Date:          Thu, 1 Jul 1993 10:16:50 -0500
>>From:          Brian Madsen <bmadsen@TRIBBLE.VORTECH.COM>
>>Subject:       Foreign language keyboards (German)
>>
>>I occasionally use Windows for writing in German, and when I do, I switch
>>the keyboard definition from US to German.  This makes it lots easier to
>>get at German foreign language characters (double ss's, umlauts, etc.)
>>
>
>There's a better way. There's a piece of shareware called WinGreek.
>That includes a program called Beta which "watches" your keyboard and
>substitutes accented characters if you type certain combinations of
>keys, e.g. if you type u followed by the plus-key on the numeric
>keypad, Beta substitutes ANSI character 0252, u-umlaut. Similarly,
>typing A followed by the plus-key makes Beta substitute ANSI 0196, A-
>umlaut. The accents used in French, Spanish etc are just as quick and
>easy to obtain.
>
>WinGreek and Beta work with any Windows product, not just word
>processors.
>
>Beta plus a single font, the Times New Roman that comes with Windows,
>can produce text in every major European language except Welsh (there
>are no w-circumflex or y-circumflex characters).
>
>The beauty of this system is that you only have to learn one set of
>special keys for all the languages:
>/ = acute,
>* = grave,
>- = circumflex,
>+ = umlaut,
>tilde = tilde (Hurray!) and
>the vertical gapped line = everything else (e.g. s followed by that
>character gives you the German SZ that looks like a capital B, but A
>followed by that character gives you the A with a ring above it which
>is used in Scandinavian languages).
>
>WinGreek also gives you a superb Greek font with all the accents and
>breathing-marks, a Hebrew font with (limited) right-to-left
>processing, and even a font for Coptic.
>
>WinGreek is on archives such as CICA, but the authors are on email. I
>can send their address if anyone is interested.
>
>.

Alec McAllister
Arts Computing Development Officer
Computing Service
University of Leeds
LS2 9JT
tel 0532 335399
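
The letter-plus-modifier scheme described in the quoted message can be
sketched in a few lines.  This is not how Beta itself works (its source
is not shown here); it only reproduces the mapping with Unicode
combining marks and treats "ANSI" as Windows code page 1252, which is
an assumption.  The "everything else" key is left out because it is
irregular.

```python
import unicodedata

# Modifier key -> Unicode combining mark, following the list in the post.
MODIFIERS = {
    "/": "\u0301",  # combining acute
    "*": "\u0300",  # combining grave
    "-": "\u0302",  # combining circumflex
    "+": "\u0308",  # combining diaeresis (umlaut)
    "~": "\u0303",  # combining tilde
}

def compose(letter, modifier):
    """Return the precomposed accented letter for a letter + modifier key."""
    return unicodedata.normalize("NFC", letter + MODIFIERS[modifier])

def ansi_code(ch):
    """Windows-1252 byte value of ch, or None if the set lacks it."""
    try:
        return ch.encode("cp1252")[0]
    except UnicodeEncodeError:
        return None

print(ansi_code(compose("u", "+")))   # 252
print(ansi_code(compose("A", "+")))   # 196
print(ansi_code(compose("w", "-")))   # None: no w-circumflex
```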

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Path: cs.utk.edu!gatech!howland.reston.ans.net!Germany.EU.net!news.dfn.de
      !news.rwth-aachen.de!urmel.informatik.rwth-aachen.de!fangorn!michael
Date: Tue, 10 Jan 95 19:06:02 MET
Organization: An old and gray machine, somewhere in Moria.
Message-ID: <9501103491@fangorn>
References: <3ekfe6$3ed@news1.shell>
NNTP-Posting-Host: akela.informatik.rwth-aachen.de
From: Michael Haardt <u31b3hs@pool.informatik.rwth-aachen.de>
Subject: Re: What is a lantern symbol...

kshaw@shell.portal.com (kendall thomason shaw) writes:
>                                                            My question
> is what similar symbols might there be in McDOS code pages 850 or 437
> for the following symbols:
> 
> 	lantern symbol
> 	checker board (stipple)
> 	board of squares
> 	scan line 1
> 	scan line 9
> 	plus

I don't know about DOS, but the characters look as follows:

checker board:

# # # # #
 # # # #
# # # # #
 # # # #
# # # # #

scan line 1 is a horizontal line at the top of a character, scan line 9
is a horizontal line at the bottom of a character.  A vt100 has various
such horizontal lines.

plus is indeed a big cross, as used in conjunction with the corner and
line symbols.  

lantern and board of squares I can not tell you right now, my vt100 is
at home.  It may be that it does not have them, at least the wyse 60 I
am using does not have those in its emulation.  The mapping characters
are very closely connected to the vt100 and the AT&T4410.

> And then I am still (of course?) baffled by the acsc/ac capability
> syntax. Am I to put an octal escape for the literal character there?
> (after the corresponding character expected, e.g. \305 for center line
> drawing criss-cross type symbol? 

Yes, indeed you can do it that way.  I used it a few years ago with
Minix.  ac=n\305 would map n to such a cross for native PC fonts.
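
As an illustration of the pairing described above, the sketch below
splits an ac/acsc-style string into (VT100 character, native character)
pairs.  The function name parse_acsc is made up, and code page 437 is
assumed as the "native PC font"; in that code page octal 305 is the
line-drawing cross.

```python
def parse_acsc(acsc):
    """Split an acsc/ac string into {vt100_char: native_char} pairs."""
    if len(acsc) % 2:
        raise ValueError("acsc must hold an even number of characters")
    return {acsc[i]: acsc[i + 1] for i in range(0, len(acsc), 2)}

# "n\305" from the post: VT100 'n' (the crossing) maps to octal 305.
# Python accepts the same octal escape inside a bytes literal.
raw = b"n\305"
pairs = parse_acsc(raw.decode("cp437"))   # cp437 assumed as the PC font

print(oct(raw[1]), hex(raw[1]))   # 0o305 0xc5
print(pairs)                      # {'n': '┼'}
```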

Michael
--
Twiggs and root are a wonderful tree (tm) Twiggs & root 1992 :-)
d? H- s(+)/(-) g! au a- w v(---) C++(+++) UL++++S++++?++++ L++ 3 E-
                N+++ tv b+ e+ h f+ m@ r++ n@ y+

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.unix.programmer,comp.terminals
Path: cs.utk.edu!gatech!howland.reston.ans.net!pipex!sunic!news.funet.fi
      !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Followup-To: comp.terminals
Date: 16 Jan 1995 13:20:24 GMT
Organization: Finnish Meteorological Institute (FMI)
Lines: 36
Message-ID: <3fdrqo$ca4@kronos.fmi.fi>
References: <snakec.789625204@larry>
NNTP-Posting-Host: dionysos.fmi.fi
In-Reply-To: Article <snakec.789625204@larry> of Ryan Groth
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Re: ASCII CODES > 127 under VT100/ANSI & CURSES

[ Followups to comp.terminals ]

snakec@larry.wyvern.com (Ryan Groth) writes in comp.unix.programmer:
|
|I am writing a few applications under SCO unix (AT&T System V, POSIX...)
|using curses. I would like to use line drawing characters in the application
|which I am positive my terminal supports. I do not want to use the box()
|function however. If I addstr() with line drawing characters in the string I
|get M's and D's on the screen. Box does draw lines. Is there a way to use
|addstr() and send line characters? I am positive that my application will

These line drawing characters come from a different character set.
The usual assignment (with curses and VT100) may be:

Bank G0		US-ASCII		Assigned with ESC ( B
Bank G1		Special Graphics	Assigned with ESC ) 0

Selecting bank G0 for characters 32-127 with SI
Selecting bank G1 for characters 32-127 with SO

ESC is 0x1B or Ctrl-[
SI  is 0x0F or Ctrl-O
SO  is 0x0E or Ctrl-N

As you can see, drawing line characters isn't so simple (the VT100
DOESN'T use characters > 127 -- the VT100 doesn't support them). You can't do
it with addstr() alone, because the task also includes character set assignments.
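
A minimal sketch of the byte sequence described above, assuming a
VT100-compatible terminal and raw output that bypasses curses. The
function name line_drawing is made up for illustration:

```python
ESC, SI, SO = "\x1b", "\x0f", "\x0e"

def line_drawing(text: str) -> str:
    """Wrap text so a VT100-style terminal renders it from Special Graphics."""
    return (
        ESC + ")0"   # assign Special Graphics to bank G1
        + SO         # select bank G1 for characters 32-127
        + text
        + SI         # select bank G0 (US-ASCII) again
    )
```

Writing line_drawing("lqqqk") to such a terminal should draw the top edge
of a box. Inside curses itself the usual route is the ACS_* constants
(for example ACS_HLINE passed to addch) rather than raw escape codes.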

(If the terminal supports 8-bit characters, you can perhaps assign Special Graphics
 to bank G1 and select bank G1 for characters 128-255 with ESC ~
 I am not sure, however, that these Special Graphics characters are duplicated
 in the upper range -- perhaps they are. )

--
- Kari E. Hurtta                             /  Elm on monimutkaista
  Kari.Hurtta@Fmi.FI			     puh. (90) 1929 658
  {hurtta,root,Postmaster}@dionysos.fmi.fi

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.lang.cobol, comp.terminals, comp.unix.aix, comp.periphs
Date: 5 Sep 1996 14:46:07 -0400
From: Richard Shuford <shuford@cs.utk.edu>
Subject: Re: terminfo files for AIX 2.3

In article <322EB2E4.594F@lincsys.com>,
 Jim Egerton <jegerton@lincsys.com> writes:
>
> Anyone have terminfo files (xterm, vt100, aixterm) that work
> with the Microfocus toolbox on AIX?
>
> Using the files shipped with Microfocus V.3.2.37 I have tried
> using an aixterm as well an xterm with TERM set to xterm and
> vt100.  With the aixterm or xterm and TERM=xterm, the video
> didn't work properly (lines were displayed as qqqqqq).


The display of a row of "qqqqqqq..." is a symptom of the client
application wanting to use the DEC Line-Drawing Character Set, which
is built into VT100s, VT320s, and any other DEC-like terminal built
since 1980.  With the proper character set mapped into the "alternate"
character set, and if the terminal (or emulation) properly honors
codeset switching, a horizontal line is displayed, instead of
"qqqqqqqq...".

(By the way, this is *not* the same as DEC's "advanced video option",
or AVO.  AVO on a VT100 gave you 24-line-by-132-column mode and the
full four video attributes: underline, reverse, bold, & blink. Later
DEC terminals had support for this as standard.)


> With an xterm and TERM=vt100, the video is great (appears to use
> the vt100 graphics character set to draw frames), but the
> function keys didn't work.

You don't say what kind of keyboard you are using.  Makes a difference.


> After copying the terminfo files to a local directory,
> pointing COBTERMINFO and TERMINFO at the local directory, and
> running the .src files through tic, the situation improved
> slightly.  The video for the aixterm and xterm with
> TERM=xterm is better, but frames are drawn using +---+
> instead of the vt100 graphics characters.

A reasonable thing to do, if the client cannot be certain that your
xterm emulation supports the line-drawing characters.


> I was able to  modify the kf1 settings in vt100.src so that the function
> keys are recognized, but the frames are drawn the same as
> with the aixterm and xterm with TERM=xterm.
>
> I also pulled the example vt100 file from the Microfocus
> Cobol home page and tried using this with an xterm.  Same
> results--no advanced video.
>
> If anyone has any terminfo files that appear to work in this
> environment, or online documentation for the settings of sgr,
> sgr0, enacs, rmacs, and acsc I'd really appreciate it.


The global master database for terminfo and termcap descriptions
is now maintained by Eric S. Raymond and is available from:

    http://www.ccil.org/~esr/ncurses.html


 ........................................
 Addendum:  the master terminfo/termcap
 files contain a "klone+acs" entry that
 tries to use the line-drawing characters
 from the IBM PC alternate character set.
 This might work with any Intel console.
 ........................................


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!news-res.gsl.net
      !news.gsl.net!news.mathworks.com!newsfeed.internetmci.com!demos
      !news.uni-stuttgart.de!uniol!uni-erlangen.de!lrz-muenchen.de
      !news.rz.uni-passau.de!
Message-ID: <32102731.87@fmi.uni-passau.de>
Organization: University of Passau, Germany
Date: Tue, 13 Aug 1996 08:56:49 +0200
From: Martin Ramsch <ramsch@fmi.uni-passau.de>
To: Mike Ching <oeefcu@mail.pixi.com>
X-Mailer: Mozilla 3.0b5 (X11; I; SunOS 5.5 sun4u)
References: <NEWTNews.839875334.4580.oeefcu@pixiuser.pixi.com>
NNTP-Posting-Host: 132.231.20.18
Lines: 35
Subject: Re: I want lines, not q's!

Mike Ching wrote:
> 
> I'm trying to write a VT-100/ANSI terminal emulator in QuickBasic, but
> I'm getting a bunch of
> 
>     qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
> 
> where there are supposed to be horizontal lines.  How am I supposed to
> recognize when there are supposed to be lines instead of q's?  I've
> noticed even some commercial programs with the same problem.


I don't know exactly about VT-100/ANSI, but xterm's behaviour should be
quite similar (BTW, what are the differences?).

What you observe is the switching between charsets:

  Control-N (SO, Shift Out): Switch to Alternate Character Set:
                             invokes the G1 character set

  Control-O (SI, Shift In):  Switch to Standard Character Set:
                             invokes the G0 character set (the default)

Which character sets G0 and G1 actually refer to is controlled by
  ESC ( <char> : Designate G0 Character Set
                 ESC ( B   = United States (USASCII)
                 ESC ( 0   = DEC Special Character and Line Drawing Set

  ESC ) <char> : Designate G1 Character Set
                 ESC ) B   = United States (USASCII)
                 ESC ) 0   = DEC Special Character and Line Drawing Set

I guess as default G0 should refer to USASCII and G1 to the Line Drawing
Set.


So, in a nutshell, you have to pay attention to these code sequences!
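
For an emulator, the bookkeeping could look roughly like the sketch below.
It assumes the defaults guessed above (G0 = USASCII, G1 = line drawing),
covers only a handful of the line-drawing glyphs, and passes every other
escape sequence through untouched. The names render and DEC_GRAPHICS are
illustrative only:

```python
# Partial DEC Special Graphics table: ASCII code -> box-drawing glyph.
DEC_GRAPHICS = {"j": "┘", "k": "┐", "l": "┌", "m": "└", "n": "┼",
                "q": "─", "t": "├", "u": "┤", "v": "┴", "w": "┬", "x": "│"}

def render(stream: str) -> str:
    banks = {"G0": "B", "G1": "0"}   # assumed power-on designations
    active = "G0"
    out = []
    i = 0
    while i < len(stream):
        ch = stream[i]
        if ch == "\x0e":                      # SO: invoke G1
            active = "G1"
        elif ch == "\x0f":                    # SI: invoke G0
            active = "G0"
        elif (ch == "\x1b" and i + 2 < len(stream)
              and stream[i + 1] in "()"):     # ESC ( x  or  ESC ) x
            slot = "G0" if stream[i + 1] == "(" else "G1"
            banks[slot] = stream[i + 2]
            i += 2                            # skip the two bytes after ESC
        elif banks[active] == "0":            # line-drawing set is active
            out.append(DEC_GRAPHICS.get(ch, ch))
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

With this in place a run of q's received after Control-N comes out as a
horizontal line, while the same bytes after Control-O stay plain q's.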

See <URL: http://www.uni-passau.de/~ramsch/WWW/ctlseqs.ps > and
    <URL: http://www.uni-passau.de/~ramsch/WWW/xtermcharset.gif >

-- 
Sincerely/Mit freundlichen Gruessen
   Martin Ramsch <m.ramsch@ieee.org>
Inbox/Fax: 02561/91371-6364
<URL: http://www.uni-passau.de/~ramsch/ >


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.os.linux.development,comp.terminals
Followup-To: comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!swrinde!pipex!sunic
 !sunic.sunet.se!news.funet.fi!news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Date: 7 Apr 1995 07:21:53 GMT
Organization: Finnish Meteorological Institute (FMI)
Message-ID: <3m2p6h$kll@kronos.fmi.fi>
In-Reply-To: Article <3bjdl0$lfd@nyx10.cs.du.edu> of Colin Plumb
References: <784.2EDBB0B0@purplet.demon.co.uk> <aeb.786191379@news.cwi.nl>
            <3bi0he$c6v@trane.uninett.no> <3bi58q$8fv@kronos.fmi.fi>
            <3bjdl0$lfd@nyx10.cs.du.edu>
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: 8-bit charset in C1-C3 banks
  (Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m - was: Re: using the V ...)



[ This is a comment on a very old article from my archive :-) ]

colin@nyx10.cs.du.edu (Colin Plumb) writes in comp.terminals:

|I just went through RFC 1345 and the CCITT Red Book Recommendation T.51.

|It seems that the standard escape sequence looks like:
|CSI P P P ... P I...I F

|Where P are "parameters" taken from the 0x30..0x3F range (0123456789:;<=>?)
|I are magic modifier flags that can totally change the meaning of the escape
|sequence, taken from 0x20..0x2F ( !"#$%&'()*+,-./)
|And F is a final letter from 0x40..0x7E (@A..Z[\]^_`a..z{|}~) which specifies
|what the escape sequence is all about.

|The parameters P are decimal numbers separated by semicolons in the usual
|way.  An all-zero field is synonymous with an empty field.  Trailing empty
|fields and the separating semicolons can be stripped.  Using a colon (:)
|is reserved for future standardization.  If the parameters start with any
|of 0x3C..0x3F (<=>?), it's private-use.

|The top bit is ignored if set, although it's not supposed to be, in all
|the arguments.

|(That is taken from ISO 6429.  It also says that F in the range of 0x70..0x7E
|is not to be standardized, but is for experimental use.)

|This applies to CSI, also known as ESC [.  However, some of the ESC sequences
|described below also seem to use a similar pattern, although the last
|group of final characters isn't reserved and none of the sequences discussed
|here have parameters.
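
The layout described above can be sketched as a small parser. This is an
illustration built only from the byte ranges quoted here, not a reference
implementation of ISO 6429; parse_csi and the CSI tuple are hypothetical
names, and the 7-bit ESC [ form is assumed:

```python
import re
from typing import List, NamedTuple

class CSI(NamedTuple):
    params: List[int]      # decimal fields, trailing empty/zero ones dropped
    intermediates: str     # bytes from 0x20..0x2F
    final: str             # one byte from 0x40..0x7E
    private: bool          # parameter string began with one of < = > ?

_CSI = re.compile(r"\x1b\[([\x30-\x3f]*)([\x20-\x2f]*)([\x40-\x7e])")

def parse_csi(seq: str) -> CSI:
    m = _CSI.fullmatch(seq)
    if m is None:
        raise ValueError("not a complete CSI sequence")
    raw, inter, final = m.groups()
    private = raw[:1] in ("<", "=", ">", "?")
    body = raw[1:] if private else raw
    if not re.fullmatch(r"[0-9;]*", body):
        # e.g. a colon, which is reserved for future standardization
        raise ValueError("parameter string uses reserved characters")
    # An empty field is treated the same as an all-zero field.
    params = [int(field) if field else 0 for field in body.split(";")]
    while params and params[-1] == 0:   # trailing empty fields can be stripped
        params.pop()
    return CSI(params, inter, final, private)
```

For example, parse_csi("\x1b[1;31m") gives the parameters 1 and 31 with the
final letter m, and a leading "?" only sets the private flag.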

|As I understand it, you have two control sets available, C0 and C1.
|Characters from 0..0x1F are in C0, and 0x80..0x9F are in C1.  In case you
|can't send 8-bit characters, ESC-@ through ESC-_ are synonyms for
|128 through 159.  (ESC-x means x+64, for 64 <= x < 96.)
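
The x+64 rule in that paragraph, written out as a pair of helper functions
(the names are invented for this illustration):

```python
def c1_from_escape(final: str) -> int:
    """Map the character after ESC (one of @ .. _) to its 8-bit C1 code."""
    x = ord(final)
    if not 64 <= x < 96:
        raise ValueError("ESC-x has a C1 equivalent only for 64 <= x < 96")
    return x + 64

def escape_from_c1(code: int) -> str:
    """Map a C1 code (128..159) to its 7-bit two-character ESC form."""
    if not 128 <= code <= 159:
        raise ValueError("not a C1 control code")
    return "\x1b" + chr(code - 64)
```

So ESC [ corresponds to 0x9B, which is the 8-bit CSI.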

|You can select a C0 set with ESC ! F, where F is one of the final
|characters discussed above, and a C1 set with ESC " F.


|There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F).
|You can have 4 of these floating around, G0, G1, G2 and G3.  The 0x20..0x7F
|and 0xA0..0xFF ranges are available to have these sets mapped into them.
|When you see a "0x3F", for example, you have to figure out which set (G0,
|G1, G2 or G3) is mapped into that space, and then figure out which character
|set is in force there.

|It's a bit like the 4 segment registers on the 8086.

|94-character sets are mapped in with ESC ( F, ESC ) F, ESC * F and ESC + F.
|These are the G0..G3 slots, respectively.  There's also an overflow range
|which is used, ESC ( ! F, etc.

|96-character sets can only be mapped to the G1..G3 slots.  That uses
|ESC - F, ESC . F and ESC / F.  The "F" assignments are independent of
|the assignments for the 94-character sets.
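
As a lookup table, the single-byte designations listed in the two
paragraphs above might be sketched like this. The "!" overflow form is
not handled, and the names are illustrative:

```python
# Intermediate byte -> (slot that gets designated, size of the set).
DESIGNATORS = {
    "(": ("G0", 94), ")": ("G1", 94), "*": ("G2", 94), "+": ("G3", 94),
    "-": ("G1", 96), ".": ("G2", 96), "/": ("G3", 96),
}

def parse_designation(seq: str):
    """Return (slot, set_size, final) for a three-byte ESC I F sequence."""
    if len(seq) != 3 or seq[0] != "\x1b" or seq[1] not in DESIGNATORS:
        raise ValueError("not a single-byte set designation")
    slot, size = DESIGNATORS[seq[1]]
    # The final byte names the set; its meaning depends on the size.
    return slot, size, seq[2]
```

Note that the same final letter can name different sets depending on
whether it follows a 94-set or a 96-set intermediate.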

|I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in
|0xA0..0xFF, but I'm not finding it documented.

|Anyway, you can then choose the mapping of bytes to graphic character
|sets.  This is done with LS0, LS1, LS2 and LS3 (locking Shift N)
|to place G0..G3 in the 0x20..0x7F range, and LS1R, LS2R and LS3R for
|the 0xA0..0xFF range.  There's also SS2 and SS3 to shift the next character
|from G2 or G3 into the 0x20..0x7F range.

|In the document I have, SS2 is 0x19 (EM) and SS3 is 0x1D (GS).
|LS0 is 0x0F (SI), and LS1 is 0x0E (SO).  LS2 is ESC n and LS3 is
|ESC o.  LS1R is ESC ~, LS2R is ESC } and LS3R is ESC |.


|There are also multi-byte character sets, using either 94 or 96
|characters, selected with ESC $ F, ESC $ ) F, ESC $ * F and ESC $ + F
|for the 94-character case, and ESC $ - F, ESC $ . F and ESC $ / F for
|the 96-character case.

|You can have "dynamically reconfigurable character sets" (downloadable fonts),
|which are specified by inserting a space (0x20) between the character-set
|specifier and the final character.  (If 63 is not enough, overflow using
|the ! hack is a possibility.)


|Oh, and finally, you can replace everything (all 128 or 256 characters)
|with ESC % F.  What happens after that depends on the new character set,
|which may or may not define ESC to get at the old things.


|Now, what I don't understand is how 8-bit character sets work.  RFC 1345
|specifies rather a lot of them, and generally uses the 96-character escapes
|for them, but there are a few 94-character escapes specified.
|In particular, ESC ( t and ESC ( | specify the NAPLPS and T.101-G2
|character sets, which are 8 bits.

|I could reconcile this if the G sets had room for two banks of characters
|(low and high), and 7-bit sets loaded both identically, while 8-bit
|sets loaded them differently, and the various shift functions fetched
|from the corresponding bank.  But I can't find it referred to anywhere.

It seems that 94-character banks really hold only 94 characters and
96-character banks only 96. With 8-bit characters, a bank holds characters
161-254 (94-bank) or 160-255 (96-bank). So once the bank is selected, the
highest bit of the character is ignored. That highest bit affects only the
selection of GR/GL, and GR/GL in turn selects which of the banks G0-G3 is used.

After that, the character is indexed from the bank as (char & 127) -- or this
is my impression from some documents (especially from
draft-ohta-text-encoding-01.txt).
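
In code, that impression amounts to something like the following sketch
(bank_index is an invented name, and the model itself is only the
reading given above, not a confirmed description of the standard):

```python
def bank_index(byte: int):
    """Split an 8-bit code into (side, index within the invoked bank)."""
    side = "GR" if byte & 0x80 else "GL"   # the top bit only picks the side
    return side, byte & 127                # the rest indexes into the bank
```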

Can you confirm this?

|Anyway, I don't think I've made any suggestions or asked any questions,
|but maybe this information dump will help some other people.
|-- 
|	-Colin

[ CC'ed to colin@nyx10.cs.du.edu ]
--
- Kari E. Hurtta                             /  Elm on monimutkaista
  Kari.Hurtta@FMI.FI			     puh. (90) 1929 658
  {hurtta,root,Postmaster}@dionysos.FMI.FI

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!news.alpha.net
     !news.mathworks.com!transfer.stratus.com!xylogics.com!Xylogics.COM!carlson
Date: 7 Apr 1995 12:08:07 GMT
Organization: Xylogics Incorporated
Message-ID: <3m39v7$2es@newhub.xylogics.com>
References: <784.2EDBB0B0@purplet.demon.co.uk> <aeb.786191379@news.cwi.nl>
            <3bi0he$c6v@trane.uninett.no> <3bi58q$8fv@kronos.fmi.fi>
            <3bjdl0$lfd@nyx10.cs.du.edu> <3m2p6h$kll@kronos.fmi.fi>
NNTP-Posting-Host: newhub.xylogics.com
From: carlson@Xylogics.COM (James Carlson)
Subject: Re: 8-bit charset in C1-C3 banks
  (Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m - was: Re: using the V ...)


In article <3m2p6h$kll@kronos.fmi.fi>,
              hurtta@dionysos.fmi.fi (Kari E. Hurtta) writes:
[...]
|> |You can select a C0 set with ESC ! F, where F is one of the final
|> |characters discussed above, and a C1 set with ESC " F.

Do you have a reference for that?  I've never seen those described or
used.  (I'm not even sure what it would mean to have a "C0 set" ...)

|> |There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F).
|> |You can have 4 of these floating around, G0, G1, G2 and G3.  The 0x20..0x7F
|> |and 0xA0..0xFF ranges are available to have these sets mapped into them.
|> |When you see a "0x3F", for example, you have to figure out which set (G0,
|> |G1, G2 or G3) is mapped into that space, and then figure out which character
|> |set is in force there.

You left out GL and GR.  GL (Graphics Left) is the pointer which maps
the 20-7E characters into one of the Gx sets.  Thus, GL has one of the
values 0, 1, 2 or 3.  GR (Graphics Right) is the pointer for the A0-FF
set.  This is usually restricted to 1, 2 or 3 (not 0).

|> |I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in
|> |0xA0..0xFF, but I'm not finding it documented.

The default (at least for VT-series terminals) is GL=0, GR=2, G0=ascii,
G1=ascii, G2=multinational and G3=multinational.

|> |Anyway, you can then choose the mapping of bytes to graphic character
|> |sets.  This is done with LS0, LS1, LS2 and LS3 (locking Shift N)
|> |to place G0..G3 in the 0x20..0x7F range, and LS1R, LS2R and LS3R for
|> |the 0xA0..0xFF range.  There's also SS2 and SS3 to shift the next character
|> |from G2 or G3 into the 0x20..0x7F range.

Actually, the locking-shift operators just change the GL and GR
pointers.
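
A small model of that, using the VT-series defaults given above and the
shift codes from the quoted article. The class and table names are made
up for this sketch:

```python
# Locking shift -> (pointer it changes, new value of that pointer).
LOCKING_SHIFTS = {
    "\x0f":  ("gl", 0),   # LS0 (SI)
    "\x0e":  ("gl", 1),   # LS1 (SO)
    "\x1bn": ("gl", 2),   # LS2
    "\x1bo": ("gl", 3),   # LS3
    "\x1b~": ("gr", 1),   # LS1R
    "\x1b}": ("gr", 2),   # LS2R
    "\x1b|": ("gr", 3),   # LS3R
}

class ShiftState:
    def __init__(self):
        # Defaults as stated for VT-series terminals.
        self.gl, self.gr = 0, 2
        self.sets = ["ascii", "ascii", "multinational", "multinational"]

    def shift(self, control: str) -> None:
        """Apply a locking shift: only a pointer moves, the G sets stay put."""
        pointer, n = LOCKING_SHIFTS[control]
        setattr(self, pointer, n)

    def set_for(self, byte: int) -> str:
        """Name of the character set a given byte is currently drawn from."""
        return self.sets[self.gr if byte & 0x80 else self.gl]
```

After LS1R, for instance, bytes in the upper half are looked up in G1
while GL is left alone.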

---
James Carlson <carlson@xylogics.com>            Tel:  +1 617 272 8140
Annex Software Support / Xylogics, Inc.               +1 800 225 3317
53 Third Avenue / Burlington MA  01803-4491     Fax:  +1 617 272 2618

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Article 3934 of comp.terminals:
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!atglab.bls.com!gatech!newsjunkie.ans.net!newstf01.news.aol.com!newsbf02.news.aol.com!not-for-mail
From: psichel@aol.com (PSichel)
Newsgroups: comp.terminals
Subject: Re: 8-bit charset in C1-C3 banks (Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m - was: Re: using the V ...)
Date: 13 Apr 1995 11:07:03 -0400
Organization: America Online, Inc. (1-800-827-6364)
Lines: 23
Sender: root@newsbf02.news.aol.com
Message-ID: <3mjemn$5j5@newsbf02.news.aol.com>
References: <3m2p6h$kll@kronos.fmi.fi>
Reply-To: psichel@aol.com (PSichel)
NNTP-Posting-Host: newsbf02.mail.aol.com

In  Message-ID: <3m2p6h$kll@kronos.fmi.fi> you wrote:

>Now, what I don't understand is how 8-bit character sets work.

8-bit character sets that follow the ISO structure (ISO 2022)
are made up of two 7-bit "halves".  For example, ASCII in GL
and ISO Latin-1 Supplemental in GR.  The combined 8-bit set
is called "ISO Latin Alphabet Nr 1" or "ISO Latin-1" for short.
[Ignoring the control sets C0 & C1 for simplicity]

ISO 8859/1 (Latin-1) through ISO 8859/9 define additional
8-bit sets by specifying the supplemental part to be used
in GR along with ASCII in GL.
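
A quick way to see the two halves at work is to decode the same bytes
under two parts of ISO 8859. This is only an illustration using Python's
built-in codecs; decode_with_parts is a made-up helper:

```python
def decode_with_parts(data: bytes, parts=("iso8859_1", "iso8859_2")):
    """Decode the same bytes under several ISO 8859 parts."""
    return {name: data.decode(name) for name in parts}

# 0x41 lies in the ASCII (GL) half, 0xA1 in the supplemental (GR) half.
sample = bytes([0x41, 0xA1])
result = decode_with_parts(sample)
```

The byte below 0x80 decodes identically in both parts, while 0xA1 comes
out as an inverted exclamation mark under Latin-1 and as A-ogonek under
Latin-2.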

IBM Code Pages are different in that they have no structure
for designating and invoking (switching) character sets or components.
Each code page defines a fixed application specific repertoire.

The term "code page" refers to the page number on which the
character set is described in IBM's master book of character
encodings.

- Peter


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Article 2632 of comp.protocols.kermit.misc:
Newsgroups: comp.protocols.kermit.misc
Path: cs.utk.edu!cssun.mathcs.emory.edu!hobbes.cc.uga.edu!news-feed-1.peachnet.edu!news.netins.net!newshost.marcam.com!uunet!psinntp!nntp.hk.super.net!news.ust.hk!apang
From: apang@cs.ust.hk (Albert PANG)
Subject: How To read/write Chinese at a remote host using UNIX C-Kermit
Message-ID: <1995Apr24.142214.28377@uxmail.ust.hk>
Sender: usenet@uxmail.ust.hk
Nntp-Posting-Host: cssu81.cs.ust.hk
Organization: The Hong Kong University of Science and Technology
X-Newsreader: TIN [version 1.2 PL2]
Date: Mon, 24 Apr 1995 14:22:14 GMT
Lines: 84


How to read/write Chinese at a remote host using UNIX C-Kermit
==============================================================

Software required:
-----------------
1) cxterm
2) kermit

'cxterm' is available at anonymous ftp 

 	ftp://cs.purdue.edu:/pub/ygz/cxterm-??.??.??.tar.Z

Linuxers can also get a binary version on Linux at

	ftp://sunsite.unc.edu:/pub/Linux/X11/xutils/terms/cxterm-??.tar.gz

C-Kermit 5A for your version of UNIX is available from

	ftp://kermit.columbia.edu/kermit/archives/cku190.tar.{Z,gz}

Setup procedure:
----------------
1. Make sure you have cxterm properly installed
   and can display/write Chinese characters in your local host.
   To get cxterm properly installed, the FAQ for cxterm, which is available
   at anonymous ftp:

	cs.purdue.edu:/pub/ygz/CXTERM.FAQ

   will be helpful. 

   There are currently a few encoding methods for Chinese characters.
   They are Big5, GB and HZ. In HK and Taiwan, Big5 is more popular
   and in Mainland China, GB and HZ are more popular. 'cxterm' can
   be configured to support all of them. Anyway, this will not be
   relevant to kermit, as long as they are 8-bit code. 'cxterm'
   configured to a particular encoding will recognize that encoding only.

2. Open a cxterm and run kermit.

3. Configure kermit. Before you connect your modem to kermit, you need
   some parameter settings:
	
	set parity none
	set command bytesize 8
	set terminal bytesize 8
	set terminal character-set transparent

   Then connect as usual and log in to your remote host.

4. At your remote host, set the terminal to allow 8-bit characters with

	UNIX-Prompt> stty pass8

   This example works on SunOS, but the syntax might differ for other UNIX
   systems, for example "stty cs8" or "stty -parity".  On non-UNIX systems
   use the appropriate command (like "set terminal /eightbit" on VMS).

   If you don't do this, you can still read Chinese, but you can't type,
   since your terminal will truncate the highest bit of your code. (unless
   of course, your terminal has already been configured)
   You might like to include the above line in your shell rc script,
   so that you won't have to type it in every time you log in.

5. Voila! You should now be able to read/write Chinese
   in your cxterm. Go get a cup of tea or something and try reading some 
   Chinese newsgroups.

	alt.chinese.txt.big5
	alt.chinese.txt
	tw.bbs.talk.joke
 
   Make sure you have the right kind of cxterm. cxterm configured
   to read Big5 will not recognize a passage written in GB, and vice
   versa. 
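
To see why the 8-bit path and the matching encoding both matter, here is
a small illustration using Python's built-in codecs (not part of cxterm
or kermit; strip_high_bit is a made-up helper that imitates a 7-bit
terminal line):

```python
def strip_high_bit(data: bytes) -> bytes:
    """Imitate a 7-bit path that truncates the highest bit of each byte."""
    return bytes(b & 0x7F for b in data)

big5 = "中".encode("big5")      # two bytes, both with the high bit set
gb = "中".encode("gb2312")      # a different pair of bytes for the same character

damaged = strip_high_bit(big5)                     # plain ASCII, Chinese lost
misread = gb.decode("big5", errors="replace")      # GB bytes read as Big5
```

Once the high bit is gone the two bytes are ordinary ASCII, and GB bytes
interpreted as Big5 decode to some unrelated character.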

And for information about how to read/write Chinese using MS-DOS Kermit,
see "Circumnavigating the Web" in Kermit News #6:

  	ftp://kermit.columbia.edu/kermit/e/newsn6.{txt,ps}
	http://www.columbia.edu/kermit/newsn6.html

-- 
Albert Pang
<apang@cs.ust.hk>

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


				Kterm Announcement
				Sat May  4 14:11:37 1991

				Internet: mleisher@nmsu.edu
				Bitnet  : mleisher@nmsu.bitnet

				Mark Leisher
				Computing Research Lab
				New Mexico State University
                                Box 3CRL
				Las Cruces, NM 88001-0001
				+1 505 646-5711

INTRODUCTION
------------
Kterm is a modified version of xterm that is capable of displaying
text from character sets requiring two bytes per character as well as
the standard single byte character sets.  The original kterm was
designed to support display of Japanese text.  This capability has
been expanded to include Chinese and Korean as well.

CHARACTER SETS AND CODINGS
--------------------------
Version 4.1.2 of kterm can display Chinese, Japanese, and Korean text
in a number of coding systems.  With the exception of the Korean
N-byte coding, all of the coding systems described below require two
bytes per character.

   1. Chinese

      A. GB2312-1980 (GuoBiao) PRC standard
         GB is a seven bit standard that requires two bytes per
         character.  It is most often used with the high (most
         significant) bit set on each byte of the character to
         distinguish the Chinese text from other seven bit text.  The
         eight bit usage of GB is also used in CCDOS, the Chinese
         version of MS-DOS.
         NOTE: Perhaps the eight bit usage should be referred to as
               EUC (Extended Unix Code).
         CODE RANGE: 0xA1A1-0xFEFE

      B. Shift-GB
         Shift-GB is a mixed seven and eight bit coding, with the
         first byte always having the high (most significant) bit set
         to distinguish it from other seven bit text.  Shift-GB was
         used by the Chinese Macintosh OS until recently.
         NOTE: I'm not sure if it is an official standard.
         CODE RANGE: 0x8140-0xAFFC (excluding 0x7F as a second byte)

      C. Big5
         Big5 is a mixed seven and eight bit coding, with the first
         byte always having the high (most significant) bit set to
         distinguish it from the other seven bit text.  Big5 is at
         least a de facto standard in places like Hong Kong and Taiwan
         where the Traditional Chinese ideographs are used.
         NOTE: Rumor has it that it is, or will be a standard in
               Taiwan.  I don't have any facts on this yet.
         CODE RANGE: 0xA140-0xF9FE

   2. Japanese
      A. JIS (Japanese Industrial Standard X0208-1983)
         JIS is a seven bit standard that is usually distinguished
         from other seven bit text by a starting and ending escape
         sequence.
         START ESCAPE SEQUENCE: <ESC>$B (NEW-JIS) <ESC>$@ (OLD-JIS)
         END ESCAPE SEQUENCE  : <ESC>(B
         CODE RANGE: 0x2121-0x7E7E
      B. Shift-JIS
         Shift-JIS is a mixed seven and eight bit coding, with the
         high (most significant) bit of the first byte set to
         distinguish it from the other seven bit text.
         CODE RANGE:
           FIRST BYTE : 0x81-0x9F and 0xE0-0xEF
           SECOND BYTE: 0x40-0xFC (excluding 0x7F)
      C. EUC
         EUC is an eight bit usage of JIS, with the high (most
         significant) bit of each byte set to distinguish it from
         other seven bit text.
         CODE RANGE: 0xA1A1-0xFEFE

   3. Korean

      A. KSC5601-1987 (Jamos and Hangul)
         This version of kterm only supports the Jamos (Hangul
         elements) and Hangul portion of the KSC5601-1987 standard.
         The Hanja portion will come later.
         KS is a seven bit standard that requires two bytes per
         Hangul character.  It is most often used with the high (most
         significant) bit set on each byte of the character to
         distinguish the Korean text from other seven bit text.
         NOTE: Perhaps the eight bit usage should be referred to as
               EUC (Extended Unix Code).
         CODE RANGE:
           JAMOS : 0xA4A1-0xA4FE
           HANGUL: 0xB0A1-0xC8FE

      B. N-byte
         N-byte code is a way of representing Hangul text using only
         ASCII characters.  It uses a variable number of bytes to
         select a particular Hangul syllable and is distinguished from
         other seven bit text by the SO (Shift Out) sequence and the SI
         (Shift In) sequence.
         START ESCAPE SEQUENCE: ^N (0x0E)
         END ESCAPE SEQUENCE  : ^O (0x0F)
         CODE RANGE: 0x41-0x7C (full range)
         NOTE: The code range actually varies.  See the file
               "hgutil.c" for details.
         
   4. X11 Compound Text
      Version 4.1.2 of kterm now recognizes most of the Compound Text
      approved standard encodings.  It does not recognize the
      non-standard character set encodings or the directionality
      indicators.  Even though the approved standard encodings are
      recognized, this is no guarantee that they will display text
      appropriately, specifically the right-to-left encodings.  Code
      will have to be added to support this.

      The 94^N Compound Text sequences for GB 2312-1980, JIS
      X0208-1983, and KS C5601-1987 will be interpreted correctly if
      the appropriate language is chosen when starting kterm, or if it
      is set in the application defaults file, KTerm.ad.
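
The relationship between the three Japanese codings above can be checked
with the codecs that ship with Python. This is only an illustration and
has nothing to do with kterm's own code; the two range checks simply
restate the Shift-JIS byte ranges listed above:

```python
text = "あ"                            # hiragana A
jis7 = text.encode("iso2022_jp")       # ESC $ B, two 7-bit bytes, ESC ( B
euc = text.encode("euc_jp")            # the same two bytes with the high bit set
sjis = text.encode("shift_jis")        # mixed seven and eight bit coding

def is_sjis_first_byte(b: int) -> bool:
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF

def is_sjis_second_byte(b: int) -> bool:
    return 0x40 <= b <= 0xFC and b != 0x7F

# The two code bytes sit between the start and end escape sequences.
jis_pair = jis7[3:5]
euc_from_jis = bytes(b | 0x80 for b in jis_pair)
```

Setting the high bit on each byte of the JIS pair gives exactly the EUC
form, whereas the Shift-JIS bytes are a different pair that falls inside
the first-byte and second-byte ranges given above.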

FONTS
-----
A number of Chinese, Japanese and Korean X11 fonts are freely
available.  Here are some anonymous ftp sites that carry them:

1. HOST: crl.nmsu.edu [128.123.1.14]
   CRL has a relatively complete collection of the freely available
   Chinese, Japanese, and Korean X11 fonts.  They are located in the
   subdirectories pub/chinese/fonts, pub/japanese/fonts, and
   pub/korean/.  The CRL site also has lists of known anonymous ftp
   sites for software related to the language of interest.

2. HOST: miki.cs.titech.ac.jp [131.112.16.39]
   HOST: utsun.s.u-tokyo.ac.jp [133.11.11.11]
   These ftp sites have large collections of many Usenet and JUNET
   newsgroup archives.  The fj.sources archives contain many of the
   Japanese X11 fonts that have been posted on JUNET.  There are Index
   files in most of the directories describing which archive file has
   the font sources.

3. HOST: kum.kaist.ac.kr [137.68.1.65]
   There are a few Korean utilities available from this site as well
   as archives of a number of Usenet news groups.  Most of the Korean
   related code and fonts are located in pub/hangul/.

AUTHORS AND CONTRIBUTORS
------------------------
The initial conversion work on xterm for displaying Japanese text was
done by kagotani@cs.titech.ac.jp (Hiroto Kagotani).

The ANSI color support was added using the kterm 4.1.0 patches
provided by mukawa@tn-sec.ntt.junet (Susumu Mukawa).

The Multi-Byte Character Set Word Select feature was added using a
modified version of Kiyoshi KANAZAWA's 4.1.0 MBCS_WSEL patches.

The Chinese and Korean support was added by
mleisher@nmsu.edu (Mark Leisher).

CLOSING NOTES
-------------
The {character set,font set,language,conversion} mechanisms are a little
clumsy and should eventually be modified to be more in line with XPG3
locale specifications and the up-coming X11 i18n specifications.
Hopefully, this won't be too far away.

BUG REPORTS
-----------
Please send bug reports and/or fixes for kterm 4.1.2 to
mleisher@nmsu.edu or mleisher@nmsu.bitnet.

THANKS
------
I would like to express my thanks to Mr. Kagotani for doing the
initial conversion work.  His code made it a lot easier for me to add
support for Chinese and Korean.

Thanks go to Ricky Yeung and F. F. Lee for making their Chinese code
conversion programs freely available.

I would also like to thank ujsung@solgai.kaist.ac.kr (UnJae Sung) for
having the patience to answer my questions about Korean coding.

And last but not least, thanks go to these people for significant bug
reports and fixes:

  John Melby of Fujitsu

  Martin C. Fong of Sybase

  Yang Zhiwei of the German National Research Center for Computer
  Science

  Alton Harkcom (for help updating the Japanese manual page)

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!vixen.cso.uiuc.edu
      !news-peer.sprintlink.net!news.sprintlink.net!Sprint!newsfeed.nacamar.de
      !wuff.mayn.de!wuff.franken.de!news-nue1.dfn.de!news-mue1.dfn.de
      !rzg.mpg.de!lrz-muenchen.de!not-for-mail
Newsgroups: comp.unix.questions,comp.unix.admin,comp.windows.x,
            comp.std.internat,comp.software.international,at.general,
            soc.culture.german,soc.culture.french,soc.culture.belgium,
            soc.culture.quebec,soc.culture.nordic,soc.culture.spain,
            soc.culture.portuguese,soc.culture.latin-american,
            soc.culture.brazil,soc.culture.argentina,soc.culture.mexico,
            soc.culture.italian,soc.culture.colombia,soc.culture.venezuela,
            soc.culture.peru,soc.culture.chile,bit.listserv.catala
Distribution: world
References: <internationalization/iso-8859-1-charset_868356133@rtfm.mit.edu>
Message-ID: <5r028v$4fn$1@sparcserver.lrz-muenchen.de>
Organization: Leibniz-Rechenzentrum, Muenchen (Germany)
Date: 21 Jul 1997 16:20:47 GMT
From: Helmut.Richter@lrz-muenchen.de (Helmut Richter)
Subject: Re: ISO 8859-1 National Character Set FAQ

mike@vlsivie.tuwien.ac.at writes:

>*****If you can confirm or deny this, please let me know.*****
>Currently, each system vendor has his own set of locale names, which
>makes portability a bit problematic.  Supposedly there is some X/Open
>document specifying a

>       <language>_<country>.<character_encoding>

>syntax for environment variables specifying a locale, but I'm unable
>to confirm this.

POSIX 1003.1 recommends (in the informative annex E.1.3) using the
following syntax for locale names: language_TERRITORY.Code, e.g.:

  de_AT.ISO8859-1
  hu_HU.ISO8859-2
  ja_JP.AJEC

The funny thing is that they use a different syntax in the example in
section B.8.1.2 (also an informative annex).
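
A sketch of splitting such a name into its parts, assuming exactly the
language_TERRITORY.Code shape of the three examples (lower-case language,
upper-case territory). Real systems accept more variants than this, and
parse_locale is an invented helper:

```python
import re

_LOCALE = re.compile(r"(?P<language>[a-z]+)_(?P<territory>[A-Z]+)\.(?P<code>.+)")

def parse_locale(name: str):
    """Return (language, territory, code) for a language_TERRITORY.Code name."""
    m = _LOCALE.fullmatch(name)
    if m is None:
        raise ValueError("not in language_TERRITORY.Code form: " + name)
    return m.group("language"), m.group("territory"), m.group("code")
```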

====

I think one should add some info on redefining a keyboard under X11 so
as to include additional characters. I have written a lengthy paper on
the topic, albeit in German
(http://www.lrz-muenchen.de/services/software/x11/xmodmap/).  I am
ready to translate a part of it into English, but certainly not all of it.

This is also interesting for emacs under X11: emacs does make a
difference between a key combination like Meta-d and a key combination
that has been redefined to mean a non-ASCII character (of course you
must not use the Meta key, which is typically the same as the Alt key,
as Mode_switch key). It is thus not necessary to quote such characters
with Ctrl-Q to prevent them from being taken for emacs commands.

Helmut Richter


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.os.linux.development.apps,
            comp.os.linux.development.system,comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net
      !demon!doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi
      !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Organization: Finnish Meteorological Institute (FMI)
Lines: 53
Message-ID: <3pn802$sc1@kronos.fmi.fi>
References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu>
In-Reply-To: Article <3p9gne$mu7@uahcs2.cs.uah.edu> of Chris Ford
NNTP-Posting-Host: dionysos.fmi.fi
Date: 21 May 1995 11:25:54 GMT
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Character set assigments (Re: How do I display IBM PC characters?)


[ Added comp.os.linux.development.system as receiver because the terminal
  driver is part of the kernel -- right? Added comp.terminals as receiver,
  because this is a terminal (or terminal emulation) issue. ]

cford@laser01.cs.uah.edu (Chris Ford)
 writes in comp.os.linux.developments.apps:
|
|Peter Koenig (koenig@interaccess) wrote:
|: I'm trying to figure out how to display IBM PC characters.  I know it's 
|: possible, but doing a simple printf() with the value gets it masked to 
|: 7-bits, and when I tried ncurses, it put the wrong character up...  Any 
|: pointers to more info on this?

|	Before you do your printf, print this: "\033(U" and it will switch
|to the DOS character set.  "\033(B" will switch back.  Or vice versa.


Just a comment (and some surprising notes :-))

This ESC ( U is quite an odd code from the standards point of view, as
far as I understand.  ESC ( assigns bank G0, and bank G0 is not accessible
in the range 128-255. That is, GR (the right side; characters (128)160-255)
can never point to bank G0, only to banks G1-G3.

It would be more understandable if the codes were
	ESC - A		Assign Latin/1 (area (128)160-255) to G1
	ESC - U		Assign DOS character set (area 128-255) to G1

But it isn't that way :-)

And the codes 'ESC ( U', 'ESC ) U', 'ESC * U' and 'ESC + U' already have
another standard meaning (see below).


(Both ESC - and ESC ) assign G1 -- the charset names are different.
 Hmm. ESC - can assign the area 160-255 (32-127),
 ESC ( can assign the area 161-254 (33-126) -- yes, these are very confusing.)

By the way -- where does the identifier "U" for the DOS character set come
from? Just curious.

Oops. The letter "U" is reserved for Latin-greek-1 (iso-ir-27) according
to RFC 1345 (which is an informational RFC).

RFC 1345 lists the following codes:
	ESC ( U		Assigns iso-ir-27 to G0
	ESC ) U		Assigns iso-ir-27 to G1
	ESC * U		Assigns iso-ir-27 to G2
	ESC + U		Assigns iso-ir-27 to G3

RFC 1345 does not list the codes
	ESC - U		Assign {something} (160-255 (32-127)) to G1
	ESC . U		Assign {something} (160-255 (32-127)) to G2
	ESC / U		Assign {something} (160-255 (32-127)) to G3
	
[ Hmm. Perhaps I will comment on some other issues later. ]



 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.os.linux.development.system,
            comp.os.linux.development,comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net
      !demon!doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi
      !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Message-ID: <3pnf1l$nc@kronos.fmi.fi>
In-Reply-To: Article <3pn802$sc1@kronos.fmi.fi> of "Kari E. Hurtta"
References: <3ok74b$5en@nntp.interaccess.com>
            <3p9gne$mu7@uahcs2.cs.uah.edu> <3pn802$sc1@kronos.fmi.fi>
NNTP-Posting-Host: dionysos.fmi.fi
Organization: Finnish Meteorological Institute (FMI)
Date: 21 May 1995 13:26:13 GMT
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Re: Character set assigments (Re: How do I display IBM PC characters?)

hurtta@dionysos.fmi.fi (Kari E. Hurtta) writes:

[ I promised to followup myself :-) ]

|[ Added comp.os.linux.developments.system as receiver because terminal driver
|  is part of kernel -- right? Added comp.terminals as receiver, because that
|  is terminal (or terminal emulation) issue. ]

[ Dropped comp.os.linux.development.apps from receivers. 
  Added comp.os.linux.development as receiver :-) ]

|cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.developments.apps:
||	Before you do your printf, print this: "\033(U" and it will switch
||to the DOS character set.  "\033(B" will switch back.  Or vice versa.

|This ESC ( U is quite an odd code from the standards point of view, as
|far as I understand.  ESC ( assigns bank G0. And bank G0 is not accessible
|in the range 128-255. That is, GR (right side; characters (128)160-255) can
|never point to bank G0. Only to banks G1-G3.

|It should be more understandable if code is 
|	ESC - A		Assing Latin/1 (area (128)160-255) to G1
|	ESC - U		Assign DOS character set (area 128-255) to G1

Because you want to keep DEC special graphics in G1 (which is the default
for VT100), and GR by default points to bank G2, it is better to use the
following codes:

	ESC . A		Assign Latin/1 (area 160-255) to G2
	ESC . U		Assign DOS character set (area 160-255(*)) to G2


(*) There is still the problem that C1 (128-159) is for control codes.
    At least some versions of the Linux terminal driver interpret one
    of these: CSI (9/11 or 0x9b). (IMHO it should interpret either all
    codes in the C1 range or none of them -- the current situation is
    confusing. Note especially the cursor control codes: IND (8/4 or
    0x84), RI (8/13 or 0x8d) and NEL (8/5 or 0x85).)
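
The column/row notation used in the note above (9/11, 8/4 and so on)
maps to a byte value as column * 16 + row. A small sketch in Python
that reproduces the hex values given there; the function name
from_column_row is made up for this illustration.

```python
def from_column_row(column, row):
    """Convert column/row notation (e.g. 9/11) to a byte value."""
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError("column and row must each be in 0..15")
    return column * 16 + row


for name, column, row in (("CSI", 9, 11), ("IND", 8, 4),
                          ("RI", 8, 13), ("NEL", 8, 5)):
    print("%s = %d/%d = 0x%02x"
          % (name, column, row, from_column_row(column, row)))
```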

|But it isn't that way :-)

|And codes 'ESC ( U', 'ESC ) U', 'ESC * U' and 'ESC + U' have already another
|standard meaning (see later).

|(both ESC - and ESC ) assigns G1 -- charset names are different. 
| Hmm. ESC - can assign areas 160-255 (32-127), 
| ESC ( can assign area 161-254 (33-126) -- yeas these are very confusing.)

<...>

|RFC 1345 lists following codes:
|	ESC ( U		Assigns iso-ir-27 to G0
|	ESC ) U		Assigns iso-ir-27 to G1
|	ESC * U		Assigns iso-ir-27 to G2
|	ESC + U		Assigns iso-ir-27 to G3

|RFC 1345 don't list codes
|	ESC - U		Assign {something} (160-255 (32-127)) to G1
|	ESC . U		Assign {something} (160-255 (32-127)) to G2
|	ESC / U		Assign {something} (160-255 (32-127)) to G3

RFC 1345 lists the MS-DOS character set (charset: IBM437), but does not
give character set assignment codes for it.

|[ Hmm. Perhaps I comment some other issues later. ]
[ There still seem to be some issues not yet covered. :-) ]



 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.os.linux.development.system,
            comp.os.linux.development,comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net!demon
      !doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi!news.csc.fi
      !kronos.fmi.fi!dionysos.fmi.fi!hurtta
Message-ID: <3pp9va$8je@kronos.fmi.fi>
References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu>
            <3pn802$sc1@kronos.fmi.fi> <3pnf1l$nc@kronos.fmi.fi>
In-Reply-To: Article <3pnf1l$nc@kronos.fmi.fi> of "Kari E. Hurtta"
Organization: Finnish Meteorological Institute (FMI)
Date: 22 May 1995 06:11:54 GMT
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Re: Character set assigments (Re: How do I display IBM PC characters?)

hurtta@dionysos.fmi.fi (Kari E. Hurtta) writes:

[ I'm still followuping myself :-) ]

||cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.developments.apps:
|||	Before you do your printf, print this: "\033(U" and it will switch
|||to the DOS character set.  "\033(B" will switch back.  Or vice versa.

<...>

||It should be more understandable if code is 
||	ESC - A		Assing Latin/1 (area (128)160-255) to G1
||	ESC - U		Assign DOS character set (area 128-255) to G1

	{ ESC - U is just my suggestion, only prefix ESC - is standard }

|Because you want keep DEC special graphics in G1 (which is default for VT100),
|and GR is bydefault pointed to bank G2. Better use following codes:

|	ESC . A		Assign Latin/1 (area 160-255) to G2
|	ESC . U		Assign DOS character set (area 160-255(*)) to G2

	{ ESC . U is just my suggestion, only prefix ESC . is standard }

|(*) There is still problem that C1 (128-159) is for control codes.
|    At least some versions of Linux terminal driver interpreter one
|    of these: CSI (9/11 or 0x9b) -- (IMHO -- it should interpreter
|    all codes in C1 range or nothing them -- current situation confusing.
|    Notice specially cursor control codes: IND (8/4 or 0x84), 
|    RI (8/13 or 0x8d) and NEL (8/5 or 0x85).) 

In the article "Re: DO use ESC [ 11 m (was: Don't use ESC 11 m"...
in the groups comp.os.linux.development and comp.terminals,
Colin Plumb <colin@nyx10.cs.du.edu> (on 30 Nov 1994 19:50:08 -0700)
gave information indicating that perhaps the correct prefix is
	ESC %
which changes the whole set (all 128 or 255 characters). So perhaps even
better codes would be something like:
	ESC % A		Assigns Latin/1 to G2,
				enables C1 (128-159) as control range,
				Assigns US-ASCII to G0
	ESC % U		Assigns MS-DOS to range 32-255 (G0,G2 and C1),
				disables C1 as control range
	{ The previous codes are just my suggestions, not from any specification.
          Only the prefix ESC % can be taken from ISO 6429 }

Hmm. According to the same article, the prefix ESC ! can be used to assign
C0 (0-31) and the prefix ESC " can be used to assign C1 (128-159).

	By the way, what were the codes for assigning UTF-8 and UTF-1?
	Was it ESC % {something}?

	I think I have heard that a code for UTF-1 has been assigned officially.

<...>

||RFC 1345 lists following codes:
||	ESC ( U		Assigns iso-ir-27 to G0
||	ESC ) U		Assigns iso-ir-27 to G1
||	ESC * U		Assigns iso-ir-27 to G2
||	ESC + U		Assigns iso-ir-27 to G3

||RFC 1345 don't list codes
||	ESC - U		Assign {something} (160-255 (32-127)) to G1
||	ESC . U		Assign {something} (160-255 (32-127)) to G2
||	ESC / U		Assign {something} (160-255 (32-127)) to G3

|RFC 1345 lists MS-DOS character set (charset: IBM437), but don't give
|character set assigment codes for this.

||[ Hmm. Perhaps I comment some other issues later. ]
|[ I still seems to be some issue not to be covered yet. :-) ]

[ Perhaps I will not follow up on myself again -- I think this is turning
  into a monologue :-) ]

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.os.linux.development,comp.terminals
Path: cs.utk.edu!gatech!swrinde!pipex!sunic!news.tele.fi!news.csc.fi
      !kronos.fmi.fi!dionysos.fmi.fi!hurtta
Message-ID: <3bjv6b$mf4@kronos.fmi.fi>
References: <784.2EDBB0B0@purplet.demon.co.uk> <aeb.786191379@news.cwi.nl> 
            <3bi0he$c6v@trane.uninett.no> <3bi58q$8fv@kronos.fmi.fi> 
            <3bjdl0$lfd@nyx10.cs.du.edu>
Organization: Finnish Meteorological Institute (FMI)
Date: 1 Dec 1994 07:49:31 GMT
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m  was: Re: using the V


colin@nyx10.cs.du.edu (Colin Plumb) writes:
> It seems that the standard escape sequence looks like:
> CSI P P P ... P I...I F

> Where P are "parameters" taken from the 0x30..0x3F range (0123456789:;<=>?)
> I are magic modifier flags that can totally change the meaning of the escape
> sequence, taken from 0x20..0x2F ( !"#$%&'()*+,-./)
> And F is a final letter from 0x40..0x7E (@A..Z[\]^_`a..z{|}_) which specifies
> what the escape sequence is all about.

Thanks.

Yes, I was a little careless. For changing character sets DEC uses the I
modifiers and the F final letters.
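
The "CSI P ... P I ... I F" layout quoted above can be sketched as a
small splitter. This is only an illustration in Python under the byte
ranges given in the quoted text (parameters 0x30..0x3F, intermediates
0x20..0x2F, final 0x40..0x7E); the function name is hypothetical and
the input is assumed to be the bytes that follow CSI.

```python
def split_control_sequence(body):
    """Split the bytes following CSI into (parameters, intermediates, final).

    Ranges follow the description quoted above: parameters 0x30..0x3F,
    intermediates 0x20..0x2F, exactly one final byte 0x40..0x7E.
    """
    index = 0
    while index < len(body) and 0x30 <= body[index] <= 0x3F:
        index += 1
    params_end = index
    while index < len(body) and 0x20 <= body[index] <= 0x2F:
        index += 1
    inter_end = index
    # Exactly one byte must remain, and it must be a valid final byte.
    if index != len(body) - 1 or not 0x40 <= body[index] <= 0x7E:
        raise ValueError("malformed control sequence: %r" % body)
    return body[:params_end], body[params_end:inter_end], body[inter_end:]


print(split_control_sequence(b"11m"))
print(split_control_sequence(b"?35h"))
```

Note that "?" (0x3F) falls in the parameter range, which is why DEC
private sequences such as CSI ? 35 h still fit this layout.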

> There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F).
> You can have 4 of these floating around, G0, G1, G2 and G3.  The 0x20..0x7F
> and 0xA0..0xFF ranges are available to have these sets mapped into them.
> When you see a "0x3F", for example, you have to figure out which set (G0,
> G1, G2 or G3) is mapped into that space, and then figure out which character
> set is in force there.

> It's a bit like the 4 segment registers on the 8086.

> 94-character sets are mapped in with ESC ( F, ESC ) F, ESC * F and ESC + F.
> These are the G0..G3 slots, respectively.  There's also an overflow range
> which is used, ESC ( ! F, etc.

The 94-character sets seem to be (in VT420):
	B			US-ASCII
	%5			DEC Multinational	

For the following character sets it is not mentioned whether they are 94-
or 96-character sets -- I think that these are 94-character sets:
        0                       DEC special graphics    
        >                       DEC Technical  
        <                       user-preferred supplemental   (*) 

And also following national character sets (available only in national mode):
        A			UK-ASCII	(ISO United Kingdom)
	4			DEC Dutch
	5			DEC Finnish
	R			ISO French
	9			DEC French Canadian
	K			ISO German
	Y			ISO Italian
	6			DEC Norwegian/Danish
	'			ISO Norwegian/Danish
	%6			DEC Portuguese
	Z			ISO Spanish
	=			DEC Swiss

(*) DEC Multinational or ISO Latin/1 (selectable with DCS ... ST codes).

> 96-character sets can only be mapped to the G1..G3 slots.  That uses
> ESC - F, ESC . F and ESC / F.  The "F" assignments are independent of
> the assignments for the 94-character sets.

The 96-character sets seem to be (in VT420):
	A			ISO Latin/1
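
The designation rules discussed above (94-character sets via ESC ( ) * +
into G0..G3, 96-character sets via ESC - . / into G1..G3 only) can be
written down as a lookup. A minimal Python sketch; the names designate,
INTERMEDIATES_94 and INTERMEDIATES_96 are invented here, and the final
byte is passed in by the caller rather than checked against any
registry.

```python
ESC = b"\x1b"

# Intermediate bytes as described above: 94-character sets can go to
# G0..G3, 96-character sets only to G1..G3.
INTERMEDIATES_94 = {0: b"(", 1: b")", 2: b"*", 3: b"+"}
INTERMEDIATES_96 = {1: b"-", 2: b".", 3: b"/"}


def designate(slot, final, size=94):
    """Build the escape sequence assigning a character set to G<slot>."""
    if size not in (94, 96):
        raise ValueError("size must be 94 or 96")
    table = INTERMEDIATES_94 if size == 94 else INTERMEDIATES_96
    if slot not in table:
        raise ValueError("a %d-character set cannot go to G%d" % (size, slot))
    return ESC + table[slot] + final


print(designate(0, b"B"))        # US-ASCII to G0
print(designate(2, b"A", 96))    # ISO Latin/1 to G2
```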

> I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in
> 0xA0..0xFF, but I'm not finding it documented.

That is how the VTxxx series terminals do it.

> There are also multi-byte character sets, using either 94 or 96
> characters, selected with ESC $ F, ESC $ ) F, ESC $ * F and ESC $ + F
> for the 94-character case, and ESC $ - F, ESC $ . F and ESC $ / F for
> the 960-character case.

You mean: ... and ESC $ / F for the 96-character case.

> Now, what I don't understand is how 8-bit character sets work.  RFC 1345
> specifies rather a lot of them, and generally uses the 96-character escapes
> for them, but there are a few 94-character escapes specified.
> In particular, ESC ( t and ESC ( | specify the NAPLPS and T.101-G2
> character sets, which are 8 bits.

> I could reconcile this if the G sets had room for two banks of characters
> (low and high), and 7-bit sets loaded both identically, while 8-bit
> sets loaded them differently, and the various shift functions fetched
> from the corresponding bank.  But I can't find it referred to anywhere.

At least codes ESC ) < 
	       ESC * <
               ESC + <
	       ESC ) %5
               ESC * %5
               ESC + %5

change both the low and high sides of the banks (I don't think I have used
other codes for selecting 8-bit character sets). I have not tried using the
high side of a bank when a 7-bit character set has been assigned to it.

> Anyway, I don't think I've made any suggestions or asked any questions,
> but maybe this information dump will help some other people.
--
- Kari E. Hurtta                             /  Elm on monimutkaista
  Kari.Hurtta@Fmi.FI			     puh. (90) 1929 658
  {hurtta,root,Postmaster}@dionysos.fmi.fi


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.unix.admin,comp.terminals
Path: cs.utk.edu!gatech!purdue!lerc.nasa.gov!magnus.acs.ohio-state.edu
      !math.ohio-state.edu!cs.utexas.edu!convex!cnn.exu.ericsson.se
      !erinews.ericsson.se!sunic!sunic.sunet.se!news.funet.fi!news.csc.fi
      !kronos.fmi.fi!dionysos.fmi.fi!hurtta
Message-ID: <3s15pj$4cs@kronos.fmi.fi>
References: <3rli9f$3qd@linet02.li.net>
            <steven.g.johnson.1-1406950720450001@taygeta.gsfc.nasa.gov>
NNTP-Posting-Host: dionysos.fmi.fi
Organization: Finnish Meteorological Institute (FMI)
Date: 18 Jun 1995 12:22:11 GMT
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: Re: extended ascii characters

[ Added comp.terminals as receiver. ]

steven.g.johnson.1@gsfc.nasa.gov (steve johnson) writes in comp.unix.admin:
|In article <3rli9f$3qd@linet02.li.net>, cagenjo@scls1 (Agenjo) wrote:
|>
|> I am part of a team setting up 54 libraries on an Internet system.  I 
|> designed a welcome screen (using a DOS text editor) that I hoped would 
|> greet new users.  I used some extended ascii characters to create a nice 
|> graphic, but when our sysadmin loaded it in, the characters we see upon 
|> login are not what I used - they have become numbers, etc.
|
| unfortunately, different systems map differently.
|
|> He doesn't 
|> think there is a way for his UNIX SunOS to properly display my file.  
|> Does anyone know of a way to do this?

| i'm no expert on this, but what you probably want is one of the isolatin
| (iso8859) character sets.  ascii is a proper subset of iso8859-1.

[ My answer is partly terminal-specific and partly relies on the
  document "ISO International Register of Coded Character Sets To Be Used
  With Escape Sequences". Sorry. ]

For box drawing ('nice graphics') he probably wants to use a special graphics
set such as the one in the VT100.
	ie -- Assign the special graphics set to bank G1
		ESC ) 0					ESC is 0033 in octal
	   -- Select bank G1 for characters 32-127
		SO					SO is 0016 in octal
	   -- For boxes you can now use these characters:
		upper left corner:  0154 in octal, 0x6C in hex
		upper right corner: 0153 in octal, 0x6B in hex
		lower left corner:  0155 in octal, 0x6D in hex
                lower right corner: 0152 in octal, 0x6A in hex
		horizontal line:    0161 in octal, 0x71 in hex
			(characters 0157 - 0163 have horizontal lines)
		vertical line:      0170 in octal, 0x78 in hex

	   -- To return to US-ASCII, select bank G0 for characters 32-127
		SI					SI is 0017 in octal
	(This assumes that G0 contains US-ASCII; if it does not,
	 you can assign it with
		ESC ( B					ESC is 0033 in octal)

	The special graphics set is DEC-specific, but for example
	(in theory) xterm also supports it.
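
As a sketch of the recipe above, the following Python fragment builds the
bytes for a one-line box. It is an illustration only: the function name
boxed is made up, it designates bank G1 with ESC ) 0 (following the rule
stated elsewhere in this collection that ESC ) assigns G1), and it assumes
that G0 already holds US-ASCII and that the terminal supports the DEC
special graphics set.

```python
import sys

ESC = "\x1b"
SO = "\x0e"   # shift out: select G1 for characters 32-127
SI = "\x0f"   # shift in: select G0 again


def boxed(text):
    """Return a string that draws a box around text, using the
    box characters listed above (0x6a..0x6d, 0x71, 0x78)."""
    width = len(text) + 2
    top = "\x6c" + "\x71" * width + "\x6b"       # upper corners + line
    bottom = "\x6d" + "\x71" * width + "\x6a"    # lower corners + line
    side = "\x78"                                # vertical line
    lines = [
        SO + top + SI,
        SO + side + SI + " " + text + " " + SO + side + SI,
        SO + bottom + SI,
    ]
    # Designate the special graphics set to G1 first (ESC ) 0).
    return ESC + ")0" + "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    sys.stdout.write(boxed("Welcome"))
```

Each graphic run is bracketed by SO and SI so that the text between the
vertical bars is still shown from G0.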

To assign Latin/1 you need a VT300 or better:
	-- First assign US-ASCII to bank G0
		ESC ( B					ESC is 0033 in octal
	-- Select bank G0 for characters 32-127
		SI					SI is 0017 in octal
	-- Assign Latin/1 range 160-255 to bank G2
		ESC . A					ESC is 0033 in octal
	-- Select bank G2 for characters 160-255
		ESC }					ESC is 0033 in octal
	* Now you have Latin/1 available

	- If you are short of banks and you don't want to use
	  special graphics in bank G1, you can assign the Latin/1
	  range 160-255 to bank G1
		ESC - A					ESC is 0033 in octal
	  and select bank G1 for characters 160-255
		ESC ~					ESC is 0033 in octal
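
The four-step Latin/1 setup listed above can be collected into one byte
string. A minimal Python sketch; the names LATIN1_SETUP and latin1_output
are invented for illustration, and the result is only meaningful on a
terminal that implements these sequences (VT300 or better, as stated
above).

```python
ESC = b"\x1b"
SI = b"\x0f"

# The four steps listed above, in order.
LATIN1_SETUP = (
    ESC + b"(B"    # assign US-ASCII to bank G0
    + SI           # select bank G0 for characters 32-127
    + ESC + b".A"  # assign Latin/1 (160-255) to bank G2
    + ESC + b"}"   # select bank G2 for characters 160-255
)


def latin1_output(text):
    """Return the setup sequence followed by text as 8-bit Latin/1 bytes."""
    return LATIN1_SETUP + text.encode("latin-1")


print(latin1_output("\u00e9t\u00e9"))
```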



 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!mp.cs.niu.edu
      !vixen.cso.uiuc.edu!howland.reston.ans.net!spool.mu.edu
      !bloom-beacon.mit.edu!crl.dec.com!crl.dec.com!nntpd.lkg.dec.com
       !regent.enet.dec.com!lasko
From: lasko@regent.enet.dec.com
X-From: (Tim Lasko, Digital Equipment Corp., Marlborough, MA)
Subject: Re: Hebrew keyboard mapping
Date: 6 JUL 95 13:54:55
Organization: Digital Equipment Corporation
Message-ID: <3th8pc$dl@nntpd.lkg.dec.com>
References: <3tfc1c$qvl@senator-bedfellow.MIT.EDU>

In article <3tfc1c$qvl@senator-bedfellow.MIT.EDU>,
 igorlord@mit.edu (Igor Lyubashevskiy) writes...
>
>Hi, I am reading my VT420 manual, and it is totally clueless about the control
>sequences that envolve Hebrew modes.... Does anyone at DEC or otherwise know 
>the correct values that go into those sequences (  CSI ? Pd h    - like ).
>Also, what are the mode identifiers of DECHEM (Hebrew encoding mode) and 
>DECNAKB (Greek Keyboard Mapping) since they are also mentioned to be either
> 34, 35, or 57 in the description, index, or examples.  

There are actually four commands:

DECRLM  - Cursor Right to Left Mode      ?34    
DECHEBM - Hebrew (Keyboard) Mode         ?35
DECHEM  - Hebrew Encoding Mode           ?36
DECNAKB - North American Keyboard Mode   ?57

I'm looking at my VT5xx programming manuals (available from Digital's ftp site)
and I still see a few typos, unfortunately.

>Finally, what is the function of
>DECNAKB and DECHEBM (two very similar functions) when SET?  The manual claims 
>that they function in an exactly opposite way to each other, which seems to me
>highly illigical.

They operate exactly as described. DECHEBM when reset and DECNAKB when set
configure the terminal to use the North American keyboard layout. When DECHEBM
is set and DECNAKB is reset, the corresponding "non North American" layout is
configured.
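
For illustration, the set and reset forms of these modes can be built as
follows. This is a sketch in Python: the mode numbers are the ones listed
above, the "h" final byte is the set form mentioned in the question, and
the "l" final byte for reset is the usual DEC convention rather than
something spelled out in this post. The names MODES and private_mode are
hypothetical.

```python
CSI = "\x1b["   # 7-bit form of CSI

# Mode numbers as listed in the reply above.
MODES = {"DECRLM": 34, "DECHEBM": 35, "DECHEM": 36, "DECNAKB": 57}


def private_mode(name, enable):
    """Build CSI ? Pd h (set) or CSI ? Pd l (reset) for a named mode."""
    return "%s?%d%s" % (CSI, MODES[name], "h" if enable else "l")


print(repr(private_mode("DECHEBM", True)))
print(repr(private_mode("DECNAKB", False)))
```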

[Back when "specials" of the VT200 series terminals were done, commands to
effect the similar operations (switching from a North American to a "non North
American" keyboard for one) weren't always well rationalized with each other
and these two got switched around. When those features were brought into the
base VT400 unit, the definitions were kept the way they were for backwards
compatibility with those units.]

-------------------------------------------------------------------------------
Tim Lasko, Digital Equipment Corp., Marlborough MA (lasko@regent.enet.dec.com)
My opinions are my own; the facts can speak for themselves. I'm on my own time.
For Digital terminal support: call 1.800.777.4343 or
     email <terminals@digital.com>


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.mail.mime,comp.terminals,comp.software.international
Path: cs.utk.edu!willis.cis.uab.edu!gatech!news.mathworks.com
      !newsfeed.internetmci.com!news.sprintlink.net!in2.uu.net!news.tele.fi
      !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Date: 31 Aug 1995 06:28:25 GMT
Organization: Finnish Meteorological Institute (FMI)
Message-ID: <423kq9$7e5@kronos.fmi.fi>
Subject: Re: Security and MIME (Especially, metamail)

[ Added comp.terminals and comp.software.international as receiver. ]

NED@innosoft.com (Ned Freed) writes in comp.mail.mime:
in article <01HUOLFW583090MTNI@INNOSOFT.COM>
|
|<...>
|Designers of user agents (and as you say this is not limited to MIME agents or
|even mail user agents) are caught between a rock and hard place on this issue.
|On the one hand, escape sequences are often used in text objects and if you
|block them the text ends up looking like garbage. This is especially burden-
|some to users of Japanese, Chinese, and Korean character sets that employ
|escape sequence switching -- block the switching sequences and the result is
|completely useless. And on the other hand, not blocking such sequences opens
|the door to these kinds of attacks. And by the way, they aren't limited to
|programmable keys -- programmable answerback sequences can also be used and
|are a lot more common on older, poorly designed equipment.
|<...>

It is quite easy to allow only those sequences that are _syntactically_
correct according to ISO 2022. The switching sequences of the Japanese,
Chinese, and Korean character sets use ISO 2022 codes (as far as I know;
I haven't read the specs of all of them -- only some).  And answerback
codes and the like don't match these codes syntactically.

Note that syntactic matching does not require a list of
all possible codes.
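
One way to picture such a syntactic filter is the sketch below, in Python.
It rests on a simplifying assumption that is mine, not the poster's: a
sequence counts as a character set switch if it has the shape ESC, one or
more intermediate bytes (0x20..0x2F), then one final byte (0x30..0x7E).
Every other ESC is made visible instead of being passed through. Single
shifts and 8-bit control codes are not handled, and the function name
filter_escapes is hypothetical.

```python
import re

ESC = 0x1B

# ESC, one or more intermediate bytes (0x20..0x2F), one final byte
# (0x30..0x7E): the shape of character set designations such as
# ESC $ B or ESC ( B.
_DESIGNATION = re.compile(rb"\x1b[\x20-\x2f]+[\x30-\x7e]")


def filter_escapes(data):
    """Keep designation-shaped escape sequences, defuse every other ESC."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        if data[pos] == ESC:
            match = _DESIGNATION.match(data, pos)
            if match:
                out += match.group()
                pos = match.end()
                continue
            out += b"^["      # show the ESC visibly instead of sending it
        else:
            out.append(data[pos])
        pos += 1
    return bytes(out)


print(filter_escapes(b"\x1b$Btext\x1b(B and \x1b[2J"))
```

No table of registered character sets is consulted; only the shape of the
sequence is checked, which is the point made above.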


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.protocols.kermit.misc
Path: cs.utk.edu!news.msfc.nasa.gov!newsfeed.internetmci.com!chi-news.cic.net
      !newsjunkie.ans.net!news.rmii.com!thoth.nilenet.com!ra.nilenet.com!gweisz
From: gweisz@nilenet.com (Gideon Weisz)
Subject: Hebrew e-mail, etc
Date: 21 Dec 1995 04:28:02 GMT
Organization: NileNet, Ltd
Lines: 29
Message-ID: <4banoi$5k0@thoth.nilenet.com>

For those who wish to do Hebrew e-mail, and already have a DOS PC
and a UNIX internet node, things are now pretty easy, particularly
if you have mskermit 3.14.

We are even hoping that there will be a Hebrew mailing list soon, and
with mskermit you can even compose Hebrew messages in the recent
English PINE easily, with the help of some scripts: kermit enables you
to write in Hebrew characters and see them on your screen going the
right way, while the scripts enable you to reverse their actual
direction and right justify afterwards.

some helpful files have been posted and are available on jerusalem1.
e-brew.txt is a cookbook style info file
e-brew.zip and its complementary ebrewadd.zip are a quickstart program
package that can also serve as a convenient toolkit, and a later
program and script package that improves it.

the e-brew files are at
ftp://ftp.jer1.co.il/pub/software/msdos/communication/e-brew.txt
ftp://ftp.jer1.co.il/pub/software/msdos/communication/e-brew.zip
ftp://ftp.jer1.co.il/pub/support/offline_mail/ebrewadd.zip

the locations might change, but that's where the files are now.
i don't want to use up bandwidth here, so anyone interested should
contact me for a copy of the full announcement or anything else
that i might be able to help with.

gideon
--
gideon weisz                                            
[boulder, colorado]


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.fonts,comp.std.internat
Path: cs.utk.edu!gatech!news.mathworks.com!fu-berlin.de!news.belwue.de
      !news.uni-konstanz.de!Otto.Stolz
Date: 9 Jan 96 11:39:09 GMT
Organization: Universitaet Konstanz
From: Otto Stolz <Otto.Stolz@Uni-Konstanz.de>
To: kirshenbaum@hpl.hp.com
References: <499ccn$102o@info4.rus.uni-stuttgart.de> <DJ3zCr.DKF.0.-s@cs.vu.nl>
    <819044957snz@sahaja.demon.co.uk> <DKLJtx.7y1@ukpsshp1.serigate.philips.nl>
     <DKoI2K.1Ip@hplabsz.hpl.hp.com>
X-URL: news:DKoI2K.1Ip@hplabsz.hpl.hp.com
Message-ID: <30f253dd.0@news.uni-konstanz.de>
Lines: 16
Subject: Re: Euro Currency Symbol (was: What does the "forin" char stand for?)


Christopher Fynn (cfynn@sahaja.demon.co.uk) wrote:
> Does anyone know if a new currency symbol for this monetary
> unit has been decided upon? 

Stephen Baynes <baynes@ukpsshp1.serigate.philips.nl> wrote:
> All the teletext standards [...] give [...] as the European
> Currency symbol [...] a glyph of a combined C and E 

evan@hpl.hp.com (Evan Kirshenbaum) wrote:
> it is the one given in the
> Unicode standard as character U+2040, "EURO-CURRENCY SIGN"

In my copy of ISO/IEC 10646-1: 1993(E), this is character number 20A0; position
2040 is assigned to the CHARACTER TIE. Unicode, most probably, complies with
ISO 10646-1, in this respect.
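
The two code positions can be checked against the character names in a
current Unicode database, for instance with Python's unicodedata module
(a quick illustration; it reflects the Unicode version shipped with the
interpreter, not the 1993 edition cited above):

```python
import unicodedata

for code_point in (0x20A0, 0x2040):
    print("U+%04X %s" % (code_point, unicodedata.name(chr(code_point))))
# U+20A0 EURO-CURRENCY SIGN
# U+2040 CHARACTER TIE
```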

Regards,
          Otto Stolz

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.protocols.kermit.misc
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!cs.utexas.edu
      !howland.reston.ans.net!gatech!newsfeed.internetmci.com!panix
      !news.columbia.edu!watsun.cc.columbia.edu!fdc
From: fdc@watsun.cc.columbia.edu (Frank da Cruz)
Date: 2 Apr 1996 15:22:20 GMT
Organization: Columbia University
Lines: 45
Message-ID: <4jrgnc$l49@apakabar.cc.columbia.edu>
References: <ZIPPY.96Mar8224419@hairball.ecst.csuchico.edu>
	<ZIPPY.96Mar3042810@hairball.ecst.csuchico.edu>
	<4hf5q8$fkj@apakabar.cc.columbia.edu>
	<ZIPPY.96Mar5010726@hairball.ecst.csuchico.edu>
	<4hkbmo$gou@apakabar.cc.columbia.edu>
Subject: Re: Kermit 3.14 + Kanji = troubles


In article <ZIPPY.96Mar8224419@hairball.ecst.csuchico.edu>
zippy@hairball.ecst.csuchico.edu (The Pinhead) writes:
: In article <4hkbmo$gou@apakabar.cc.columbia.edu>
: fdc@watsun.cc.columbia.edu (Frank da Cruz) writes:
:
::   set file character-set shift-jis           <-- (Irrelevant)
::   set terminal bytesize 8                    <-- Yes, you need this
::   set parity none                            <-- Ditto
::   set terminal character-set transparent     <-- Ditto
:: 
:: This is exactly the set of commands you need.  If it doesn't work, that
:: most likely means that CP982 is not the active code page, or that you don't
:: have DOS/V in Japanese mode.  Another possibility is that the host that
:: you are connecting to is not itself in 8-bit mode.  For example, if it were
:: a SunOS 4.x system, you would need to tell it to:
:: 
::   stty pass8
:: 
:: before you could see 8-bit characters (the Kanji bytes of Shift-JIS have
:: their 8th bits set to 1).  Use the equivalent command ("stty cs8" or 
:: whatever) on other versions of UNIX or other operating systems.
: 
: You've been really helpful, Frank!  Thanks...  However, just one
: problem remains...  MSKermit 3.14 seems to be remapping the line
: drawing characters.  The application is an ACUCobol program in
: shift-jis mode.  Using CKermit 5A(190) under Linux with Japanese
: extensions, the line drawing character 0xc4 displays properly as a
: horizontal bar, but under MSKermit 3.14 it displays as the katakana
: character "to" (0x44).
: 
Sorry for the delay in replying.  This is from our informant in Japan:

There are two problems:

1) There is no line drawing character in the Japanese DOS/V character
   set (not only line drawing characters but also many symbols which
   are included in US PC-DOS, e.g., the copyright mark etc).

2) 0xc4 is officially defined as Katakana "to" in Shift-JIS code.
   If we change the mapping, it will cause many problems on many
   Japanese hosts where they use JIS-X-201 Katakana (Hankaku Katakana).

(End quote)
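
The clash over byte 0xc4 can be reproduced by decoding the same byte under
both character sets. A short Python illustration, using the interpreter's
"cp437" codec to stand in for the US PC character set and its "shift_jis"
codec for Shift-JIS:

```python
import unicodedata

byte = b"\xc4"
for codec in ("cp437", "shift_jis"):
    char = byte.decode(codec)
    print("%-9s U+%04X %s" % (codec, ord(char), unicodedata.name(char)))
# cp437     U+2500 BOX DRAWINGS LIGHT HORIZONTAL
# shift_jis U+FF84 HALFWIDTH KATAKANA LETTER TO
```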

- Frank

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.os.linux.development.system,comp.std.internat,comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!math.ohio-state.edu
      !howland.reston.ans.net!gatech!newsfeed.internetmci.com!in2.uu.net
      !nntp.inet.fi!news.funet.fi!news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi
      !hurtta
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Date: 14 Apr 1996 10:29:40 GMT
Organization: Finnish Meteorological Institute (FMI)
Message-ID: <4kqk2k$5tm@kronos.fmi.fi>
References: <4k73al$2mo@portal.gmu.edu>
            <4k87g6$5n@cortex.dialin.rrze.uni-erlangen.de>
            <yv2bul3adbl.fsf@i44pc2.ipd.info.uni-karlsruhe.de>
            <4kandi$rm@cortex.dialin.rrze.uni-erlangen.de>
            <yv24tqub8r4.fsf@i44pc2.ipd.info.uni-karlsruhe.de>
            <4kdcda$1fe@cortex.dialin.rrze.uni-erlangen.de>
            <316eb9a7.13321510@news.ucs.ubc.ca>
In-Reply-To: Article <316eb9a7.13321510@news.ucs.ubc.ca> of Eric Gisin
Subject: Re: Linux and UNICODE?

[ Added comp.terminals as receiver. ]

ericg@unixg.ubc.ca (Eric Gisin) writes in comp.os.linux.development and
comp.std.internat:
<...>

| I thought stateful encodings were added to standard C at IBM's request, whose
| EBCDIC-based multibyte character sets require it. What is ISO 2022, and is
| anyone using it? I wouldn't want to see GNU libc implement something that's
| never going to be used.
<...>

It is something that is used, for example, in Digital VT series terminals
for selecting and managing character sets... Look for example at the VT300
series or newer.


Linux's console driver implementation does not count (unless it has improved
since I last looked at it :-))


ISO 2022 is also used in Chinese and Japanese character sets.

So different parts of ISO 2022 are definitely in wide use...


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Article 8995 of comp.dcom.telecom:
Path: cs.utk.edu!darwin.sura.net!spool.mu.edu!telecom-request
Date: 26 Sep 93 16:03:52 GMT
From: jjmhome!pig!die@transfer.stratus.com (Dave Emery)
Newsgroups: comp.dcom.telecom
Subject: Re: Information Wanted on Six-bit Code
Reply-To: jjmhome!pig!die@transfer.stratus.com
Message-ID: <telecom13.665.2@eecs.nwu.edu>
Organization: Opinion Mongers Incorperated...
Sender: telecom@eecs.nwu.edu
Approved: telecom@eecs.nwu.edu
X-Submissions-To: telecom@eecs.nwu.edu
X-Administrivia-To: telecom-request@eecs.nwu.edu
X-Telecom-Digest: Volume 13, Issue 665, Message 2 of 12
Lines: 69

In article <telecom13.660.4@eecs.nwu.edu> johan@tts.lth.se (Johan M
Karlsson) writes:

> I just wonder if anybody know anything about the Six-bit code called
> TTS, that was used by many newspapers in the 70's to receive stories
> from the wire services. Like what does the letters TTS stand for?


	TTS stands for TeleTypeSetter.  Indeed it is a 6-bit code
which was developed by AT&T's now defunct Teletype subsidiary in the
early 50s as a means of inputting news stories direct to Linotype
machines.  As such it incorporates the special control characters that
operate Linotype machines such as upper rail and lower rail shifts and
em space and en space.

	Originally in the days long before computers in the pockets of
every reporter, the wire services had computerized systems that ran on
mainframes for creating formatted stock tables, sports box scores,
racing information and other highly structured text.  Sending this
material in TTS code ready for direct input into a type casting
machine saved local newspapers the services of several compositors and
made it possible for them to publish reams of this sort of material at
low cost.

	Later, in the 60's and early 70s the wire services developed
computer programs to format (perform hyphenation and justification)
their regular news feed into standard newspaper columns using Linotype
control characters.  Many of the newspaper oriented wire service wires
(particularly the AP A wire) were transmitted in TTS code in this era
and could be directly input to a Linotype typesetting machine.

	TTS code was popular for wire service distribution for another
reason: it supported upper and lower case.  The earlier Baudot
alphabet only supported upper case which meant that a human being had
to worry about getting the case correct in transcribing stories into
type -- but TTS had the correct case already.

	TTS format paper tape in fact became a standard in the
printing industry for input to composition equipment of later
generations than Linotype machines.  TTS represented an alphabet for
encoding text formatted for printing, and may still see some use for
this purpose today.

	Teletype developed a modification of their model 15 workhorse
wire service teleprinter to print TTS in upper and lower case on rolls
of Teletype paper; this machine was called the model 20 monitor
printer.  Many newspapers which did not actually use TTS input to
their typesetting machines for news stories used these machines to
print out stories in upper and lower case for later entry by human
compositors.

	Newspapers which used TTS input directly usually punched the
TTS into 6 level paper tape for off line entry into Linotype machines.
So a typical newspaper would have a monitor printer and a tape punch
on each of their TTS wires.

	TTS wire transmissions were usually low speed (66 or 75 wpm)
at baud rates adjusted for the 8.42 element code.  This resulted in
some strange low baud rates that gave the designers of serial port
boards for early minicomputers fits.
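
The arithmetic behind those odd rates can be sketched as follows.  This
is only a rough illustration: the post gives the speeds and the 8.42
element figure, while the six-characters-per-word convention and the
split of 8.42 into start, data and stop elements are assumptions of
mine, so the exact rates used on real circuits may have differed.

```python
# Rough line-rate arithmetic for a start-stop teleprinter code.
# Assumptions (not from the post): one "word" is 6 characters including
# the space, and 8.42 elements = 1 start + 6 data + 1.42 stop.

CHARS_PER_WORD = 6
ELEMENTS_PER_CHAR = 8.42


def baud_rate(words_per_minute: float) -> float:
    """Signalling elements per second needed to sustain the given speed."""
    chars_per_second = words_per_minute * CHARS_PER_WORD / 60
    return chars_per_second * ELEMENTS_PER_CHAR


for wpm in (66, 75):
    print(f"{wpm} wpm is roughly {baud_rate(wpm):.2f} baud")
```

Under these assumptions neither speed lands on a round number, which is
the kind of rate a fixed-divisor serial clock handles badly.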

	TTS was largely replaced in the mid 70s by the high speed
ASCII wire transmissions and by newspaper computerized composition
systems which could do hyphenation and justification automatically and
output text direct to optical typesetters.  Remnants of it survive,
however, in the standard ASCII format for transmitting wire service
news stories which incorporates ASCII versions of some of the special
typesetter control characters.

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.dcom.telecom
Path: cs.utk.edu!gatech!howland.reston.ans.net!spool.mu.edu!telecom-request
Date: Sun, 26 Sep 1993 14:10:51 -0500 (cdt)
Message-ID: <telecom13.664.15@eecs.nwu.edu>
Organization: TELECOM Digest
Sender: telecom@eecs.nwu.edu
Approved: telecom@eecs.nwu.edu
X-Submissions-To: telecom@eecs.nwu.edu
X-Administrivia-To: telecom-request@eecs.nwu.edu
X-Telecom-Digest: Volume 13, Issue 664, Message 15 of 15
From: Brian D McMahon <MCMAHON@AC.GRIN.EDU>
Subject: Re: kUPL@ TELEGRAFNYJ MODEM (095) 212-3937

> [Moderator's Note: This message came to me from Russia. I have no idea
> at all what he is saying, except I think it has to do with a BBS or
> public access site in Moscow. This was the entire text. Can someone
> read it to me?  PAT]

> sRO^NO KUPL@ TELEGRAFNYJ MODEM
> tEL: (095) 212-39-37  sIDORENKO sERGEJ.

Hi, Pat.  That would be "srochno kuplyu telegrafnyj modem," or
"urgently (want to) buy a telegraphic modem."  Signed by Sergej
Sidorenko.

I have no idea what a "telegraphic" modem is; I'm not up on the
technical terminology.  At a guess, the gentleman wants to buy a FAX modem.

The message text, BTW, is in a format known as KOI-7, one of several
mutually incompatible (sigh) methods of transmitting Russian Cyrillic
text over the net.  Upper and lower case are reversed, as you probably
guessed.
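
A minimal sketch of the decoding, assuming the text is KOI-8 Cyrillic
with the high bit stripped (which is how the sample above behaves).  The
function name is hypothetical and the table covers only the 31 letter
positions from '@' through '^' and their lower-case counterparts.

```python
# Decode "KOI-7"-style text: KOI-8 Cyrillic with the high bit stripped,
# so ASCII upper-case positions carry lower-case Cyrillic and vice versa.
# The letter order below is the KOI-8 ordering for 0xC0..0xDE.

LOWER = "юабцдефгхийклмнопярстужвьызшэщч"   # ASCII 0x40 ('@') .. 0x5E ('^')

TABLE = {}
for offset, letter in enumerate(LOWER):
    TABLE[0x40 + offset] = letter           # '@', 'A'..'Z', '[' .. '^'
    TABLE[0x60 + offset] = letter.upper()   # '`', 'a'..'z', '{' .. '~'


def koi7_to_cyrillic(text: str) -> str:
    """Map letter positions to Cyrillic; digits and punctuation pass through."""
    return text.translate(TABLE)


print(koi7_to_cyrillic("sRO^NO KUPL@ TELEGRAFNYJ MODEM"))
print(koi7_to_cyrillic("tEL: (095) 212-39-37  sIDORENKO sERGEJ."))
```

Because the Cyrillic letters sit at the positions of their rough Latin
sound-alikes, the raw text stays half-readable even without decoding.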


Brian McMahon        <MCMAHON@GRIN1.BITNET>        <MCMAHON@AC.GRIN.EDU>
Postmaster / Acad. Software Support   Grinnell College Computer Services
Grinnell, Iowa 50112 USA   Voice: +1 515 269 4901   Fax: +1 515 269 4936


[Telecom Moderator's Note: You think then a 'telegraphic modem' would be
a fax modem?  My thanks to the 27 other responses I received to this query.
I selected a few to use here which make a good representative sample
of the lot.  PAT]

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Article 8992 of comp.dcom.telecom:
Path: cs.utk.edu!gatech!howland.reston.ans.net!spool.mu.edu!telecom-request
Date: Sun, 26 Sep 1993 02:14:29 -0400
From: anarres!gaarder@TC.Cornell.EDU
Newsgroups: comp.dcom.telecom
Subject: kUPL@  TELEGRAFNYJ MODEM (095) 212-3937
Message-ID: <telecom13.664.14@eecs.nwu.edu>
Organization: TELECOM Digest
X-Telecom-Digest: Volume 13, Issue 664, Message 14 of 15


Passing that through a little transliteration program I wrote back
during the coup in the Soviet Union (remember then? I was glued to my
Usenet feed!)  produces:

Srochno kuplyu telegrafnyy modem
Tel: (095) 212-39-37  Sidorenko Sergey.

Which I read as offering to buy a modem. I'm not sure just what
"srochno" means in this context; my dictionary defines it as "of term;
to be paid at a fixed date; due; payable".  "Kuplyu" means "I buy"; I
don't know whether a "telegrafnyy modem" is a special kind of modem or
just a modem in general.

Why this is here is a puzzle; probably it was sent to the wrong address.


Steve Gaarder   gaarder@anarres.ithaca.ny.us


[Moderator's Note: Well no, it was not sent to the wrong address. He
wrote 'telecom-request@mintaka.lcs.mit.edu' which is just an alias
that points at me. That is, he did not post to a newsgroup where it
found its way to comp.dcom.telecom; some news program found it lacking
authorization and shoved it to me. He mailed it direct, albeit to an
alias I had forgotten existed, going back to the days of jsol. So he
must think we can do something for him. Fancy that; he wants to buy
a modem, and here I thought he was looking for publicity for his BBS
or similar and decided to give it to him.  PAT]

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Date: Wed, 12 Mar 1997 14:09:38 GMT
From: John Savard <seward@netcom.ca>
Newsgroups: alt.folklore.computers
Subject: Re: Pre-1988 Windows ECS

dski@cameonet.cameo.com.tw wrote:

>Anyone know what Windows used for an extended character set before
>ISO 8859-1 came along?

>Dan Strychalski
>dski@cameonet.cameo.com.tw

No, but ISO 8859-1 came along in 1985, since the Amiga used it then.
Only the plus and minus signs weren't agreed on (and, from the code
chart, they're in a position that ought to be used for the OE and oe
ligatures, and, according to the manual for my inkjet printer, is so
used in some Unix character set).

The other possibility would have been to use the DOS character set, of
course, which is still used in some Windows fonts.

John Savard


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!news1.radix.net!news4.agis.net
      !www.nntp.primenet.com!nntp.primenet.com!news-feed.inet.tele.dk
      !news.nacamar.de!howland.erols.net!worldnet.att.net
      !cpk-news-hub1.bbnplanet.com!news.bbnplanet.com!newsxfer3.itd.umich.edu
      !newsxfer.itd.umich.edu!yale!oitnews.harvard.edu!fas-news.harvard.edu
      !newspump.wustl.edu!newsreader.wustl.edu!not-for-mail
Date: Wed, 26 Mar 1997 17:53:44 -0600
Message-ID: <3339B708.7B98B902@artsci.wustl.edu>
References: <5ha085$mh0@reader.seed.net.tw>
From: Tom Stepleton <ssteplet@artsci.wustl.edu>
Subject: Re: Strange IBM glyphs (Was: Amiga)

dski@cameonet.cameo.com.tw wrote:
> I've heard it said the 5150 was conceived as a game machine. To look at
> some of the characters IBM assigned to values in the ASCII control range
> -- playing-card symbols and the like -- the idea doesn't seem so far-
> fetched. And a bunch of I/O addresses were designated as the "game port."

I wonder about this as well. Why did IBM use all of those bizarre
glyphs for the control characters? The smiley faces (01,02), the game
cards (03-07), the gender signs (0B,0C), and the 16th notes (0E) don't
seem to serve too much of a purpose and aren't that easy for Joe BASIC
Game Programmer to put on the screen with only PRINT statements.
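
For reference, here is a small lookup of the glyphs in question, written
as their Unicode equivalents.  It is a sketch, not a full rendering of
the PC character generator: only the codes mentioned above are listed,
and the helper name is made up.  Note that code 07 is a bullet rather
than a card suit, and that both 0D and 0E are musical notes.

```python
# Glyphs the IBM PC character generator shows for a few codes in the
# ASCII control range, written here as their Unicode equivalents.
CONTROL_GLYPHS = {
    0x01: "\u263A",  # white smiling face
    0x02: "\u263B",  # black smiling face
    0x03: "\u2665",  # heart
    0x04: "\u2666",  # diamond
    0x05: "\u2663",  # club
    0x06: "\u2660",  # spade
    0x07: "\u2022",  # bullet
    0x0B: "\u2642",  # male sign
    0x0C: "\u2640",  # female sign
    0x0D: "\u266A",  # eighth note
    0x0E: "\u266B",  # beamed eighth notes
}


def show(raw: bytes) -> str:
    """Render bytes as PC screen glyphs: table for low codes, CP437 otherwise."""
    return "".join(CONTROL_GLYPHS.get(b) or bytes([b]).decode("cp437") for b in raw)


print(show(b"\x03\x04\x05\x06"))
```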

I remember hearing somewhere long ago that all these card symbols and
such originated on Wang word processing systems, but I don't trust my
memory...

--Tom
+-----------+---------------------------+            ____
| Stepleton | ssteplet@artsci.wustl.edu |>-------|\__/_/__
+-----------+---------------------------+        \________}

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov
      !news.indiana.edu!chi-news.cic.net!metro.atlanta.com
      !feeder.chicago.cic.net!newsrelay.netins.net!news.ececs.uc.edu
      !newsfeeds.sol.net!news.maxwell.syr.edu!supernews.com!news
Organization: All USENET -- http://www.Supernews.com
Message-ID: <333c5a9c.12736765@news.comland.com>
References: <5ha085$mh0@reader.seed.net.tw>
            <3339B708.7B98B902@artsci.wustl.edu>
Date: Sat, 29 Mar 1997 00:00:30 GMT
From: orestes@comland.com (William D. Leara)
Subject: Re: Strange IBM glyphs (Was: Amiga)


On Wed, 26 Mar 1997 17:53:44 -0600, Tom Stepleton
   <ssteplet@artsci.wustl.edu> wrote:
>
> I wonder about this as well. Why did IBM use all of those bizarre
> glyphs for the control characters? The smiley faces (01,02), the game
> cards (03-07), the gender signs (0B,0C), and the 16th notes (0E) don't
> seem to serve too much of a purpose and aren't that easy for Joe BASIC
> Game Programmer to put on the screen with only PRINT statements.
>
> I remember hearing somewhere long ago that all these card symbols and
> such originated on Wang word processing systems, but I don't trust my
> memory...

You're right on the money.  Check out the October 2, 1995 edition of
FORTUNE magazine, specifically the interview with Paul Allen and Bill
Gates.  Bill says:

  "... we were also fascinated by dedicated word processors from Wang,
  because we believed that general-purpose machines could do that just
  as well.  That's why, when it came time to design the keyboard for the
  IBM PC, we put the funny Wang character set into the machine--you
  know, smiley faces and boxes and triangles and stuff.  We were
  thinking we'd like to do a clone of Wang word-processing software
  someday."

-- 
William Leara
orestes@comland.com

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov
      !news.indiana.edu!vixen.cso.uiuc.edu!howland.erols.net!europa.clark.net
      !newsfeeds.sol.net!noc.nyx.net!nyx10.cs.du.edu!not-for-mail
Date: 29 Mar 1997 12:27:22 -0700
Organization: University of Denver, Dept. of Math & Comp. Sci.
Message-ID: <5hjqeq$raa@nyx10.cs.du.edu>
NNTP-Posting-Host: nyx10.nyx.net
From: snorwood@nyx10.cs.du.edu (Scott Norwood)
Subject: origins of '\' (backslash) on keyboards?


I remember reading with interest the thread on the origins of the '\'
as a directory separator for M$-DOS (as opposed to the '/' used in UNIX).

Now, here's another question:  at what point did the backslash key
become standard for computer keyboards?  It's not on my typewriter, nor
is it on the keyboard of my Apple II (whose keyboard is essentially the
same as a teletype terminal), but it does exist on early DEC terminals
(VT-100, etc.) and other equipment of the late-1970's vintage.  How
did this practice start?  Does it have any roots in the UNIX convention
of using the backslash to indicate that the following character should
be treated literally (as in referring to filenames with spaces or other
'weird' characters in them)?

-- 
Scott Norwood:  snorwood@nyx.net, snorwood@balloon.ml.org, senorw@mail.wm.edu
Lame Home Page #1:  http://balloon.ml.org/          <-- School year only
Lame Home Page #2:  http://www.nyx.net/~snorwood/   <-- Regular page

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!newsfeed.nacamar.de
      !europa.clark.net!newsfeed.internetmci.com!news1.infoave.net!usenet
Date: Sun, 30 Mar 1997 05:27:14 GMT
Organization: Fantasy Farm Fibers
Message-ID: <333df90f.107920290@news.swva.net>
References: <5hjqeq$raa@nyx10.cs.du.edu>
NNTP-Posting-Host: pem02-02.swva.net
From: bernie@rev.net (Bernie Cosell)
Subject: Re: origins of '\' (backslash) on keyboards?

snorwood@nyx10.cs.du.edu (Scott Norwood) wrote:
}
} I remember reading with interest the thread on the origins of the '\'
} as a directory separator for M$-DOS (as opposed to the '/' used in UNIX).
}
} Now, here's another question:  at what point did the backslash key
} become standard for computer keyboards?

It has been on computer keyboards for a very long time.  The early Model 33
Teletypes had forward-slash and reverse-slash on the keyboard.  At the time,
there was no particular preference for one over the other: the keyboard
just included both slashes.  [it also had "uparrow" and "backarrow", which
a later revision of ASCII (at the time the model 37 came out) changed to
caret and underscore, respectively].  Neither forward-slash nor
reverse-slash was added for the convenience of computers...

  /bernie\
--

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Date: 30 Mar 1997 20:44:25 GMT
From: "Douglas W. Jones,201H MLH,3193350740,3193382879"
     <jones@pyrite.cs.uiowa.edu>
Newsgroups: alt.folklore.computers
Subject: Re: origins of '\' (backslash) on keyboards?

From article <5hjqeq$raa@nyx10.cs.du.edu>,
by snorwood@nyx10.cs.du.edu (Scott Norwood):

> Now, here's another question:  at what point did the backslash key
> become standard for computer keyboards?  ...

The Teletype Models 33 (ASR, KSR, etc) had a backslash (shift L, if memory
serves), and anything that copied the Teletype character set verbatim also
had it.  The Model 33 was the first ASCII terminal, and the 64 character
subset of ASCII it supported (upper case only!) included the backslash.

                                        Doug Jones

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov
      !news.indiana.edu!chi-news.cic.net!arclight.uoregon.edu
      !feed1.news.erols.com!howland.erols.net!swrinde!ihnp4.ucsd.edu
      !newshub.nosc.mil!news!mshapiro
Organization: NCCOSC RDT&E Division, San Diego, CA
References: <5hjqeq$raa@nyx10.cs.du.edu> <859668981snz@tnglwood.demon.co.uk>
            <m3hghuth9j.fsf@phrodo.texas.net>
Message-ID: <1997Mar31.215753.24813@nosc.mil>
Date: Mon, 31 Mar 1997 21:57:53 GMT
From: Michael D Shapiro <mshapiro@nosc.mil>
Subject: Re: origins of '\' (backslash) on keyboards?

In article <m3hghuth9j.fsf@phrodo.texas.net>,
Al Castanoli  <afcasta@texas.net> wrote:
>Robert Billing <unclebob@tnglwood.demon.co.uk> writes:
>
>[...]
>
>:  The key was certainly on the old ASR33, long before there were
>: VT-anything terminals. I suspect that it antedates UN*X itself, and
>: goes back to the deep magic at the dawn of ASCII.
>
>It was not on the Model 28 ASR, though ... I remember having to put
>"backwards slash" in messages with Mod 28 ASR's and KSR's.  Probably
>a tradeoff in cramming ASCII into the Baudot bitstream.
>

The reverse solidus (back slash) showed up fairly early in ASCII code
development, which was (as I recall) in the early 1960s.  An excellent
background on the history of character sets is in the book "Coded
Character Sets" (I was about to give a more complete reference but
forgot the author and publisher).  Please let me know if you want a
more complete reference.

Incidentally, the Japanese equivalent of ASCII, JISCII, places the yen
symbol in place of the reverse solidus.

-- 
Michael D. Shapiro, Ph.D.                   Internet: mshapiro@nosc.mil
Code 4123, NCCOSC RDT&E Division (NRaD)              San Diego CA 92152
Voice: (619) 553-4080         FAX: (619) 553-4808         DSN: 553-4080

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Date: 1 Apr 1997 16:12:05 GMT
From: BBReynolds <bbreynolds@aol.com>
Message-ID: <19970401161101.LAA26059@ladder01.news.aol.com>
Subject: Re: origins of '\' (backslash) on keyboards?


The complete reference is Charles E. Mackenzie, <Coded Character Sets,
History and Development>, The Systems Programming Series, Reading,
Massachusetts, Addison-Wesley, 1980.  Chapters 12 and 13 cover the
development of ASCII; the reverse solidus a/k/a (or is that a\k\a??)
backslash was part of the original specification.

-- 
Bruce B. Reynolds, Systems Consultant:
Founder of Trailing Edge Technologies---
Sweeping Up Behind Data Processing Dinosaurs

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Date: 1 Apr 1997 22:02:06 GMT
Message-ID: <5hs0ku$mae$1@news.wizvax.net>
From: John Wilson <wilson@dbit.dbit.com>
Subject: Re: origins of '\' (backslash) on keyboards?

In article <5hmjb9$afo@flood.weeg.uiowa.edu>,
Douglas W. Jones,201H MLH,3193350740,3193382879 <jones@pyrite.cs.uiowa.edu>
wrote:
>The Teletype Models 33 (ASR, KSR, etc) had a backslash (shift L, if memory
>serves), and anything that copied the Teletype character set verbatim also
>had it.  The Model 33 was the first ASCII terminal, and the 64 character
>subset of ASCII it supported (upper case only!) included the backslash.

What particularly impressed me about that nasty little mechanical keyboard
on the 33 was that its method of generating control characters was consistent,
i.e. shift-K gave you "[" and if you wanted ESCape (^[) (N.B. NOT ALTMODE!),
you typed ctrl-shift-K.  And if I remember right, there was an interlock so
that when you had ctrl and shift down, only those data keys that would now
send something different would allow themselves to be pressed.  Kinda cute.
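
The consistency described here falls out of simple bit arithmetic on the
ASCII codes.  The sketch below models only that arithmetic, not the
Teletype's mechanism; the set of keys with a shifted legend is partly my
assumption (K and L are confirmed in this thread, the rest are from
memory), and the function name is hypothetical.

```python
# Simplified model of the letter keys on an upper-case-only ASCII keyboard.
# Assumption: on keys with a shifted legend, SHIFT flips bit 0x10, and
# CTRL keeps only the low five bits of whatever code would be sent.

SHIFTABLE = set("KLMNOP")   # K and L are from the thread; the rest assumed


def key_code(letter: str, shift: bool = False, ctrl: bool = False) -> int:
    code = ord(letter.upper())
    if shift and letter.upper() in SHIFTABLE:
        code ^= 0x10          # 'K' 0x4B -> '[' 0x5B
    if ctrl:
        code &= 0x1F          # '[' 0x5B -> ESC 0x1B
    return code


print(hex(key_code("K", shift=True, ctrl=True)))
```

In this model ctrl-shift only changes the result on the shiftable keys,
which fits the interlock behaviour recalled above.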

That answerback drum is something pretty special too...

-- 
John Wilson
0,3 @ SID

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov
      !news.indiana.edu!chi-news.cic.net!arclight.uoregon.edu!europa.clark.net
      !cpk-news-hub1.bbnplanet.com!cam-news-hub1.bbnplanet.com
      !news.bbnplanet.com!howland.erols.net!EU.net!news.eunet.fi
      !news.microdata.fi!nntp.inet.fi!news.sci.fi!usenet
Message-ID: <3340bb0f.65034735@news.sci.fi>
Date: Tue, 01 Apr 1997 09:03:07 GMT
From: Paul Keinanen <keinanen@sci.fi>
Subject: Re: origins of '\' (backslash) on keyboards?

mshapiro@nosc.mil (Michael D Shapiro) wrote:
>
>The reverse solidus (back slash) showed up fairly early in ASCII code
>development, which was (as I recall) in the early 1960s.  An excellent
>background on the history of character sets is in the book "Coded
>Character Sets" (I was about to give a more complete reference but
>forgot the author and publisher).  Please let me know if you want a
>more complete reference.
>
>Incidentally, the Japanese equivalent of ASCII, JISCII, places the yen
>symbol in place of the reverse solidus.

While strictly speaking ASCII is a purely US standard, many national
7-bit character sets exist in the rest of the world, which are almost
identical to the ASCII character set, but a few character positions
are reserved for national variations. There are usually 6 to 9
character positions that differ from the ASCII representation.
Crosshatch character, code 35 (decimal), is in many character sets the
pound sign. Character codes after Z (91..94) and after z (123..126)
are used for national variants. The backslash (92) is in the Dutch
character set '1/2', while in the Finnish, German and Swedish
character sets it is upper case O with two dots, in the French and
Italian character set it is c with cedilla, in the Norwegian set it is
O with a slash and in  the Spanish set it is N with tilde.
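
To make the effect concrete, here is a small sketch of how one and the
same byte shows up under the national sets just listed.  The table is
taken from the paragraph above; the variant labels and the function name
are only illustrative, and the other national positions are ignored.

```python
# What byte 92 (0x5C) displays as under the national 7-bit sets listed above.
# The table is taken from the post; the set names are just labels.
CODE_92 = {
    "US": "\\",
    "Dutch": "\u00BD",      # vulgar fraction one half
    "Finnish": "\u00D6",    # O with diaeresis
    "German": "\u00D6",
    "Swedish": "\u00D6",
    "French": "\u00E7",     # c with cedilla
    "Italian": "\u00E7",
    "Norwegian": "\u00D8",  # O with stroke
    "Spanish": "\u00D1",    # N with tilde
}


def render(data: bytes, variant: str) -> str:
    """Decode 7-bit text, substituting the national glyph for code 92 only."""
    return data.decode("ascii").replace("\\", CODE_92[variant])


print(render(b"C:\\DOS", "German"))
```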

Apparently, when the terminal manufacturers had to make keyboards for
these languages and include keys for the "extra" characters, it was
not economical to manufacture keyboards with a different number of keys
for each market.  So the extra keys in the US version generated the
same character codes as in the foreign versions, but were labelled with
the backslash etc. key caps, which otherwise would not have "deserved"
keys of their own.

Paul Keinanen

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Sender: <quipu-request@cs.ucl.ac.uk>
Message-Id: <9207241339.AA04105@skinfaxe.diku.dk>
Cc: <quipu@cs.ucl.ac.uk>
Date: Fri, 24 Jul 1992 15:39:12 +0200
From: Steen Linden <anubis@diku.dk>
Subject: Re: Character sets

In message <9207231551.AAgandalf08152@gandalf.uio.no>
   <h.b.furuseth@usit.uio.no> asked:
>
>  [about character sets in email-directory support]
>
> If not, where do I start?  Must I patch each DUA, or just some library?

The ISO8859-1 conversion stuff is in libcommon.a. Take a look at 
isode-8.0/dsap/common/string.c. The most interesting functions are
strprint() and iso8859print(). 

I haven't done any of the work you are requesting, though I could definitely
use it. I was just looking through the code in search of the T.61 version 
of our Danish common national letters.

Here they are by the way:

    T.61  X11 Keysym  ISO8859/1 ASCII
    ---------------------------------
    \f1   ae          0xE6      {
    \e1   AE          0xC6      [ 
    \f9   oslash      0xF8      |
    \e9   Ooblique    0xD8      \
    \caa  aring       0xE5      }
    \caA  Aring       0xC5      ]
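
The last two columns of this table are enough for a tiny converter from
the 7-bit Danish convention to ISO 8859-1.  This is a sketch with a
made-up function name, not the ISODE code; it would also mangle any
genuine brackets or braces in the input.

```python
# The six ASCII positions from the table above, mapped to the Danish
# letters they display as in the national 7-bit set.
DANISH_7BIT = str.maketrans({
    "{": "\u00E6",   # ae
    "[": "\u00C6",   # AE
    "|": "\u00F8",   # oslash
    "\\": "\u00D8",  # Ooblique
    "}": "\u00E5",   # aring
    "]": "\u00C5",   # Aring
})


def danish_to_latin1(text: str) -> bytes:
    """Reinterpret 7-bit Danish text and return ISO 8859-1 bytes."""
    return text.translate(DANISH_7BIT).encode("latin-1")


print(danish_to_latin1("bl}b{rgr|d"))
```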

--Steen

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Date: Thu, 03 Apr 1997 22:23:33 GMT
Message-ID: <5i1au5$bpj@tor-nn1-hb0.netcom.ca>
From: John Savard <seward@netcom.ca>
Subject: Re: origins of '\' (backslash) on keyboards?

In <3340bb0f.65034735@news.sci.fi>, keinanen@sci.fi (Paul Keinanen) wrote:
>
> While strictly speaking ASCII is a purely US standard, many national
> 7-bit character sets exist in the rest of the world, which are almost
> identical to the ASCII character set, but a few character positions
> are reserved for national variations.

Yes, and these character sets belong to International Telegraph
Alphabet No. 5, which is the international version of ASCII; so there
is a worldwide standard based on ASCII.

John Savard

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Message-ID: <5i1aop$bpj@tor-nn1-hb0.netcom.ca>
Date: Thu, 03 Apr 1997 22:20:40 GMT
From: John Savard <seward@netcom.ca>
Newsgroups: alt.folklore.computers
Subject: Re: origins of '\' (backslash) on keyboards?

snorwood@nyx10.cs.du.edu (Scott Norwood) wrote:

> I remember reading with interest the thread on the origins of the '\'
> as a directory separator for M$-DOS (as opposed to the '/' used in UNIX).

> Now, here's another question:  at what point did the backslash key
> become standard for computer keyboards?

Well, back in 1964, when the original ASR-33 Teletype was produced--
when ASCII was invented, in other words--the backslash was part of
the character set.

Back then, the caret was instead an up arrow (which it should have
remained, being useful as an exponentiation operator);
the underscore was an arrow pointing left;

and there were no lowercase characters; ` { | } and ~ did not exist
yet. However, in addition to DEL, the last few characters before it
were controls as well: the positions then held by ACK, ESC, and ALT MODE
now hold printing characters, and a different control character is used
for ESC.

The last 8 of the first 32 characters did not have their present
meanings; they were just S0 through S7.

John Savard

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: alt.folklore.computers
Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!newsfeed.nacamar.de
      !news-feed.inet.tele.dk!cpk-news-hub1.bbnplanet.com!news.bbnplanet.com
      !news-peer.sprintlink.net!news.sprintlink.net!sprint!howland.erols.net
      !spring.edu.tw!feeder.seed.net.tw!reader.seed.net.tw!!dski
Organization: Cameo Communications, Inc.
Message-ID: <5iujss$h0h@reader.seed.net.tw>
NNTP-Posting-Host: 192.72.104.4
Date: 15 Apr 1997 00:59:08 GMT
From: dski@cameonet.cameo.com.tw
Subject: Re: ASCII History - No Cents?

John Savard (seward@netcom.ca) wrote --

> Since ASCII was originally developed in 1963 for US use, the cents
> sign would perhaps have been a useful thing to include.

Both ISO and ASA (now called ANSI) began work on text encoding
standards in 1961. ASA being an ISO member body, it seems likely that
the work was coordinated to some degree. The 1963 ASA standard was for
a six-bit uppercase-only version of ASCII; seven-bit "ASCII" seems to
have been adopted first in *Europe*, in the form of ECMA-6, which the
European Computer Manufacturers Association approved in 1965.

The earliest seven-bit U.S. version I know of dates from 1968. This is
also the year in which President Johnson mandated ASCII for the federal
government's computer operations.

> I have felt that the following assignments would make a good 'National
> Use' version of ASCII for North American English-language word
> processing, considering the keyboard arrangement:

Which keyboard arrangement? Before IBM got into the microcomputer
sca^H^H^Hbiz, many non-alphanumeric marks were not where you find them
now. Most digital keyboards used ASCII-based pairings (["] with [2];
[&] with [6]; ['] with [7]; [(] and [)] with [8] and [9]; etc.). IBM
used a layout similar to that of the Selectric.
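
The ASCII-based pairings are not arbitrary: each shifted symbol is the
digit's code with a single bit flipped.  A short sketch (the function
name is mine) that reproduces the pairs listed above:

```python
# On a bit-paired keyboard the shifted digit is the digit's code with
# bit 0x10 flipped, which yields the pairings listed above.
def shifted_digit(digit: str) -> str:
    return chr(ord(digit) ^ 0x10)


for d in "26789":
    print(d, shifted_digit(d))
```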

> Of course, I am deeply distressed both by the placement of the
> multiplication and division signs in eight-bit ASCII where OE and oe
> clearly belong...
>
> and, in the opposite direction, by the fact that the only standard
> eight-bit ASCII is totally devoted to foreign-language word
> processing. I wanted AND, OR, less than or equal to, not equal to,
> greater than or equal to, and other symbols useful for _programming
> languages_ ( X and -^H:, although useful for ALGOL, hardly count ) in
> an eight-bit ASCII. Along, of course, with the _Greek_ alphabet [...]

Code Page 437! You have it!

I've seen so many different characters used for AND, OR, NOT, et al, in
*typeset* material, I wonder if they just couldn't agree on which ones
to use. Did these get standardized when I wasn't looking?

> (although LLL 8-bit ASCII isn't perfect either, as APL should instead
> get a character set of its own).

"8-bit ASCII" is a contradiction in terms, I believe. What's LLL?

-- 
Dan Strychalski
<dski@cameonet.cameo.com.tw>

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!news.maxwell.syr.edu
      !news-peer.sprintlink.net!news.sprintlink.net!Sprint!howland.erols.net
      !europa.clark.net!newsfeed2!argos.tel.hr!news
Date: 7 May 1997 19:36:08 GMT
Organization: Croatian Post & Telecommunications
Message-ID: <01bc5b1d$4d7c4420$86e11dc3@gost.hr>
References: <01bc5620$cf1a97e0$62e31dc3@gost.hr>
      <01bc56f2$19bb2220$d2e31dc3@gost.hr> <5kepgr$rgf@neptune.theplanet.co.uk>
NNTP-Posting-Host: ac23-p1-zg.tel.hr
From: "Bernard Grgic" <bernard.grgic@zg.tel.hr>
Subject: Re: Need help with Fonts in Hyper Terminal Win95

Only monospaced (non-proportional) fonts are listed in the Hyper Terminal
font set, NOT every TT font, as I expected in the beginning.

I had to redesign one of them (making Croatian characters in place of some
other characters, like square brackets).

The problem was that I need to communicate with UNIX through
code page CP437, not CP852, where I have Croatian characters by default.
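
The mismatch between the two code pages is easy to check with any tool
that knows both tables.  The sketch below uses Python's built-in cp437
and cp852 codecs; the letter list and the helper name are my own.

```python
CROATIAN = "čćđšžČĆĐŠŽ"


def encodable(letters: str, codepage: str) -> str:
    """Return the letters that have a code point in the given code page."""
    kept = []
    for ch in letters:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            continue          # no slot for this letter in the code page
        kept.append(ch)
    return "".join(kept)


print("cp852:", encodable(CROATIAN, "cp852"))
print("cp437:", encodable(CROATIAN, "cp437"))
```

With no slots for these letters in CP437, redrawing the glyphs of some
little-used positions is about the only option left.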

I have a VGA driver for CP437 with (redesigned) Croatian characters, but it
does not work with programs in graphics mode.

Greetings, Bernard.

oliver st.john <olivers@pixel.co.uk> wrote in article
<5kepgr$rgf@neptune.theplanet.co.uk>...
>
> >Do not waste your time reading the text below!
> >The problem has been solved, successfully.
> >Bernard
>
> [
> [    HOW?  I'm sure a few of us would like to know...
> [
>
> >Bernard Grgic <bernard.grgic@zg.tel.hr> wrote in article
> >   <01bc5620$cf1a97e0$62e31dc3@gost.hr>...
> >> Hi,
> >> How can I choose another TT font which is not listed in Hyper Terminal? I
> >> need to do it if I want to have Croatian characters on the screen when
> >> communicating with UNIX. I have such characters in TT fonts but they are not
> >> listed in the Hyper Terminal font list.
> >>
> >> If anyone can help or give me any suggestion, please, do it.
> >> Thank you.
> >>                  Bernard

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.lang.pl1
Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!newsfeeds.sol.net
      !news.maxwell.syr.edu!feed1.news.erols.com!news.ecn.uoknor.edu
      !munnari.OZ.AU!news.mel.connect.com.au!harbinger.cc.monash.edu.au
      !news.rmit.EDU.AU!goanna.cs.rmit.edu.au!not-for-mail
Message-ID: <5outhp$h6n$1@goanna.cs.rmit.edu.au>
References: <33B2DACD.4279@ix.netcom.com>
Organization: Comp Sci, RMIT University, Melbourne, Australia.
Date: 27 Jun 1997 09:21:29 +1000
From: rav@goanna.cs.rmit.edu.au (robin)
Subject: Re: CHARSET (48) vs CHARSET (60)

In <33B2DACD.4279@ix.netcom.com>, dneubart@ix.netcom.com writes:
>I'm trying to map the PL/I 48-character set to 60-character set.
>I haven't been able to match anything more than
>
>               .. (dot dot)    to      : (colon)
>               ,. (comma dot)  to      ; (semi-colon)
>
>Does anyone know the 48-character equivalents for
>
>               > (greater than)
>               < (less than)
>               | (logical or)
>               etc.


OTOMH,

> is GT
< is LT
| is OR
& is AND
|| is CAT
>= is GE
<= is LE
^= is NE
 ^ is NOT
-> is PT
^> is NG
^< is NL

Are there any others?
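
As a sketch of how the list above could be applied mechanically, the
following rewrites 60-character-set operators as blank-delimited
keywords.  It is an illustration only: the table is the from-memory list
above, the function name is hypothetical, and nothing here skips string
literals or comments as a real converter would have to.

```python
import re

# 60-character-set operators and their 48-character-set spellings,
# copied from the list above.
TO_48 = {
    ">": "GT", "<": "LT", "|": "OR", "&": "AND", "||": "CAT",
    ">=": "GE", "<=": "LE", "^=": "NE", "^": "NOT", "->": "PT",
    "^>": "NG", "^<": "NL",
}

# Try two-character operators first so ">=" is not read as ">" then "=".
_OPS = sorted(TO_48, key=len, reverse=True)
_PATTERN = re.compile(r"\s*(" + "|".join(map(re.escape, _OPS)) + r")\s*")


def to_charset48(source: str) -> str:
    """Spell out each operator as a blank-delimited keyword."""
    spelled = _PATTERN.sub(lambda m: " " + TO_48[m.group(1)] + " ", source)
    return re.sub(r" {2,}", " ", spelled).strip()


print(to_charset48("IF A>=B & ^C THEN X = P->Q || R;"))
```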

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.lang.pl1
Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!vixen.cso.uiuc.edu
      !ais.net!news.maxwell.syr.edu!howland.erols.net!psinntp
      !clothos.candle.com!phobos.candle.com!news
Organization: Candle Corporation
Message-ID: <33B2FBEE.E9E@candle.com>
References: <33B2DACD.4279@ix.netcom.com>
Date: Thu, 26 Jun 1997 16:31:58 -0700
From: Eric Jackson <Eric_Jackson@candle.com>
Subject: Re: CHARSET (48) vs CHARSET (60)

dneubart@ix.netcom.com wrote:
>
> I'm trying to map the PL/I  48-character set to 60-character set.

Boy, this takes me back.  These are the ones I can think of off hand:

  GT >
  LT <
  OR |
  LE <=
  GE >=
  NOT *
  NE *=
  CAT ||
  AND &

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Path: cs.utk.edu!darwin.sura.net!spool.mu.edu!bloom-beacon.mit.edu!ai-lab
      !prep.ai.mit.edu!gnu
Newsgroups: gnu.announce,gnu.utils.bug,comp.unix.shell,
            comp.unix.programmer,comp.unix.misc
Followup-To: gnu.utils.bug
Message-ID: <9312240149.AA11561@icule.UUCP>
To: info-gnu@prep.ai.mit.edu
Date: Fri, 24 Dec 93 01:49:20 GMT
From: pinard%icule.UUCP@iro.umontreal.ca (Francois Pinard)
Subject: Release: GNU recode version 3.3
Approved: info-gnu@prep.ai.mit.edu


Here is my Christmas gift to GNU users.  Best wishes to all of you!

GNU recode 3.3 should soon be available on prep.ai.mit.edu, as file
pub/gnu/recode-3.3.tar.gz.  All reported bugs have been corrected.
Thanks to all those who contributed comments or suggestions.

recode converts files between character sets and usages.  When exact
transliterations are not possible, it may get rid of the offending
characters or fall back on approximations.  This program recognizes or
produces nearly 150 different charsets, able to transliterate files
between almost any pair.  Most RFC 1345 charsets are supported.
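
To illustrate the kind of trade-off described here (this is not recode
itself, nor its command syntax), the sketch below converts Latin-1 bytes
to the IBM PC code page using Python's standard codecs.  Characters with
no CP437 equivalent are either replaced by '?' or dropped; the function
and parameter names are hypothetical, and real approximations such as
writing "a" for an accented "a" are not attempted.

```python
def latin1_to_ibmpc(data: bytes, drop: bool = False) -> bytes:
    """Convert ISO 8859-1 bytes to code page 437.

    Characters missing from CP437 become '?' by default; with drop=True
    they are removed from the output instead.
    """
    text = data.decode("latin-1")
    return text.encode("cp437", errors="ignore" if drop else "replace")


print(latin1_to_ibmpc(b"Fran\xe7ois"))      # c-cedilla exists in CP437
print(latin1_to_ibmpc(b"S\xe3o"))           # a-tilde does not
print(latin1_to_ibmpc(b"S\xe3o", drop=True))
```

Either fallback loses information, which is why a round trip through
such a conversion is not reversible in general.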

Please report bugs to:
        bug-gnu-utils@prep.ai.mit.edu

Here is a list of user visible changes from version 3.2.4:

* Charsets atarist, ebcdic-ccc, ebcdic-ibm and nextstep have been added.
* Also, most RFC 1345 charsets and aliases are handled.  That's a bunch!
* Old ascii disappears because of RFC 1345's ascii, use ascii-bs instead.
* Old maci disappears because of RFC 1345's macintosh, use applemac instead.
* Charsets cccascii and cdcascii disappear, use ebcdic-ccc and ebcdic instead.
* Recoding between latin1, ibmpc and applemac is (almost) reversible.
* The texinfo documentation has been reorganized; this is to be continued.
* Long options are accepted, charset names may be abbreviated.
* Option --list (-l) displays charsets, aliases and contents in many formats.
* Option --strict (-s) asks for stricter, non-reversible recodings.
* Option --graphics (-g) approximates ibmpc rulers with ASCII graphics.
* Option --header (-h) produces C source for many recoding tables.
* Option --auto-check (-a) reports about all possible recodings.
* Option --ignore (-x) prevents a charset from being selected.
* Execution has been sped up through step merging, hashing for charset names.
* Many various buglets have been eradicated, portability increased.
* Charsets may be edited out by modifying the Makefile only.
* Configuration is made through the use of an external config.h file.
-- 
Franc,ois Pinard        ``Vivement GNU!''       pinard@iro.umontreal.ca
About the League for Programming Freedom?  Email me or lpf@uunet.uu.net


[ Most GNU software is packed using the new `gzip' compression program.
  Source code is available on most sites distributing GNU software.

  For information on how to order GNU software on tape, floppy, or
  cd-rom, check the file etc/ORDERS in the GNU Emacs distribution or in
  GNUinfo/ORDERS on prep, or e-mail a request to: gnu@prep.ai.mit.edu

  By ordering your GNU software from the FSF, you help us continue to
  develop more free software.  Media revenues are our primary source of
  support.  Donations to FSF are deductible on US tax returns.

  The above software will soon be at these ftp sites as well.
  Please try them before prep.ai.mit.edu!   thanx -gnu@prep.ai.mit.edu
	ASIA: ftp.cs.titech.ac.jp, utsun.s.u-tokyo.ac.jp:/ftpsync/prep,
  cair.kaist.ac.kr:/pub/gnu
	AUSTRALIA: archie.au:/gnu (archie.oz or archie.oz.au for ACSnet)
	AFRICA: ftp.sun.ac.za:/pub/gnu
	MIDDLE-EAST: ftp.technion.ac.il:/pub/unsupported/gnu
	EUROPE: irisa.irisa.fr:/pub/gnu, ftp.univ-lyon1.fr:pub/gnu,
  ftp.mcc.ac.uk, unix.hensa.ac.uk:/pub/uunet/systems/gnu,
  src.doc.ic.ac.uk:/gnu, ftp.win.tue.nl, ugle.unit.no, ftp.denet.dk,
  ftp.informatik.rwth-aachen.de:/pub/gnu, ftp.informatik.tu-muenchen.de,
  ftp.eunet.ch, nic.switch.ch:/mirror/gnu, ftp.funet.fi:/pub/gnu, isy.liu.se,
  ftp.stacken.kth.se, ftp.luth.se:/pub/unix/gnu, archive.eu.net
	WESTERN CANADA: ftp.cs.ubc.ca:/mirror2/gnu
	USA: wuarchive.wustl.edu:/mirrors/gnu, labrea.stanford.edu,
  ftp.digex.net:/pub/gnu, ftp.kpc.com:/pub/mirror/gnu,
  ftp.cs.widener.edu, uxc.cso.uiuc.edu, ftp.hawaii.edu:/mirrors/gnu,
  ftp.cs.columbia.edu:/archives/gnu/prep, col.hp.com:/mirrors/gnu,
  gatekeeper.dec.com:/pub/GNU, ftp.uu.net:/systems/gnu
]

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Date: 19 Mar 1998 08:53:34 GMT
From: "Casper H.S. Dik - Network Security Engineer"
     <Casper.Dik@Holland.Sun.Com>
Newsgroups: comp.unix.solaris
Subject: Re: Special Symbol Characters (copyright, trademark, etc.)?
 
[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]
 
R!ch <richardt@uk.sun.com> writes:
 
>On Wed, 18 Mar 1998, Akira Hangai wrote:
 
>> How could I type a special symbol character such as copyright,
>> trademark, registered, etc., in a program like Text Editor, Netscape
>> Mail, or even ShellTool/Dt Terminal?
 
>There's a section in one of the manuals that shows the Compose key
>sequences for most of these (stuff like £, ©, æ, etc) - but irritatingly,
>I've forgotten which one.  IIRC, you have to be using an 8 bit locale
>to display the characters.
 
There's also the table in /usr/openwin/share/include/X11/Suncompose.h
 
ComposeTableEntry compose_table[] = { ...
 
Of course, as a native Dutch person I miss the ability to use
compose i-j to create a "ÿ"  (y with diaeresis, compose " y)
 
Note that you can also use the reverse of the compositions, as the
lookup routine will first sort the two characters on ASCII values and then
look up the entry in the table.
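
The order-independent lookup described above can be sketched in
Python.  The table entries here are illustrative only and are not
copied from Suncompose.h; the names COMPOSE_TABLE and compose are
hypothetical.

```python
# Hypothetical compose table.  Each key holds the two keystrokes already
# sorted by ASCII value, so one entry serves both typing orders.
COMPOSE_TABLE = {
    ("c", "o"): "©",
    ('"', "y"): "ÿ",
    ("-", "L"): "£",
}


def compose(first: str, second: str):
    """Return the composed character, or None if the pair is unknown."""
    # Sorting single characters orders them by code point, which for
    # these keystrokes is the same as ordering by ASCII value.
    key = tuple(sorted((first, second)))
    return COMPOSE_TABLE.get(key)
```

Both compose("c", "o") and compose("o", "c") return "©", because the
pair is normalised before the table is consulted.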
 
Casper
 
 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Date: 21 Mar 1998 00:41:37 GMT
From: "Richard L. Hamilton" <rlhamil@smart.net>
Newsgroups: comp.unix.solaris
Subject: Re: Special Symbol Characters (copyright, trademark, etc.)?


In article <Pine.SOL.3.96.980319080212.13918d-100000@paddington>,
        R!ch <richardt@uk.sun.com> writes:
> On Wed, 18 Mar 1998, Akira Hangai wrote:
> 
>> How could I type a special symbol character such as copyright,
>> trademark, registered, etc., in a program like Text Editor, Netscape
>> Mail, or even ShellTool/Dt Terminal?
>
>
> There's a section in one of the manuals that shows the Compose key
> sequences for most of these (stuff like £, ©, æ, etc) - but irritatingly,
> I've forgotten which one.  IIRC, you have to be using an 8 bit locale
> to display the characters.
> 


 
Or just look at the compose_table[] initializer in
/usr/openwin/include/X11/Suncompose.h

 
You don't even have to understand C to figure that one out.




> --
> R!ch (Email is flakey at present: use richardt@keaton.uk.sun.com)

>          |  Richard Teer                richard.teer@uk.sun.com  |
>          |                          WWW: www.rkdltd.demon.co.uk  |

ftp> get |fortune
377 I/O error: smart remark generator failed
Bogonics: the primary language inside the Beltway
mailto:rlhamil@mindwarp.smart.net  http://www.smart.net/~rlhamil

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 

 -------------------------------------------------------------------


Date: 27 Mar 1998 09:12:37 GMT
From: vjp2@dorsai.org @smtp.dorsai.org (Vasos Panagiotopoulos +1-917-287-8087 
Bioengineer-Financier)
Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users,
    soc.culture.greek
Subject: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts

        I have successfully read Greek from Lynx on Windows using the
Courier New Greek font from www.hri.org/fonts about two years ago -
(yet it did not work with the www.goarch.org Greek pages - at least
not back then - and they told me that since it worked with Windows,
they weren't going to bother). But I have never succeeded in doing
this with DOS and MSKermit. I have only tried this using the abcgrl
program (I thought it should work because kdp works the same way for
Japanese). Perhaps I have been setting the character set wrong in Lynx
(heck if I even remember what I used in Windows). HRI had a codepage
which doesn't look like the typical IBM codepages in name (I forget
right now) so I was confused if it would work and if indeed IBM offers
a different codepage inside Greece. Where would I be able to find the
standard IBM codepage in the USA (preferably on the web)?  Is it
possible that the problem with the IBM codepage is that it uses an
older format and not the (ISO) ELOT 928?  (I saw the Greek fonts for
Win 3.11 and I don't recall seeing it there, but I might be wrong.)
Please excuse my confusion.

                                - = -
Vasos-Peter John Panagiotopoulos II, Columbia'81+, Bioengineer-Financier, NYC
   BachMozart ReaganQuayle EvrytanoKastorian http://WWW.Dorsai.Org/~vjp2
               vjp2@{MCIMail.Com|CompuServe.Com|Dorsai.Org}
   ---{Nothing herein constitutes advice. Everything fully disclaimed.}---


 ----------------------------------------------------------------------

Date: 27 Mar 1998 15:33:44 GMT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users,
    soc.culture.greek
Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts

MS-DOS Kermit has no built-in support for Greek.  You would need to find
and load a Greek code page that agrees with the host encoding, and then
use "set terminal character set transparent".
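
Why the code page has to agree with the host encoding can be shown
with Python's built-in codecs.  This is a sketch under the assumption
that the host sends ISO-8859-7 (ELOT 928); the helper name show is
hypothetical.

```python
# Bytes as an ISO-8859-7 (ELOT 928) host would send them.
text = "Ελλάδα"
host_bytes = text.encode("iso8859_7")


def show(raw: bytes, codec: str) -> str:
    """Interpret raw bytes the way a display using `codec` would."""
    return raw.decode(codec, errors="replace")


greek_ok = show(host_bytes, "iso8859_7")   # matching code page
dos_us = show(host_bytes, "cp437")         # US DOS code page
dos_greek = show(host_bytes, "cp737")      # a DOS Greek code page
```

Only the matching code page reproduces the original text.  The other
two decode without error but show different characters, because the
same byte values are assigned to other glyphs.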

: Where would I be able to find the
: standard IBM codepage in the USA (preferably on the web)?
:
Good question.  If you find an answer, please be sure to post it.

By the way, Kermit 95 does support Greek: both ELOT 927 and 928.

- Frank

-------------------------------------------------------------

Date: 30 Mar 1998 10:13:53 GMT
From: vjp2@dorsai.org @smtp.dorsai.org (Vasos Panagiotopoulos +1-917-287-8087 
Bioengineer-Financier)
Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users,
    soc.culture.greek
Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts
Followup-To: comp.protocols.kermit.misc,grk.forthnet.users,soc.culture.greek

        I got gauss.cpi from www.hri.org/fonts but don't know if
there's a way to make MS-Kermit alone use it.  Since it doesn't work
the way the MS-DOS help files say it should (but I don't have the Greek
files the MS-DOS help files say I need - I'm told only NT 4.0 DOS is
country-blind and has them all on one version - I was wondering if
there is a web page to find them at? Or do I have to go out and buy
"Greek MS-DOS"?).  KDP is a Japanese-font utility which allows Kermit
to read Japanese (You run Kermit with the DOS command "KDP MSKERMIT")
I have used the ABCGRL.COM utility (TSR?)  to use ELOT fonts in Emacs,
VEdit and other DOS programs, but it doesn't seem to work with Kermit
(linking to Lynx). In Windows, connecting to Lynx with a terminal
emulator, works ok if I set IBM CODEPAGE and RAW MODE.  But in
MSKermit, this just beeps a lot and splatters all over the page when
Greek text is encountered. I am also confused, so sorry in advance.

                                - = -
Vasos-Peter John Panagiotopoulos II, Columbia'81+, Bioengineer-Financier, NYC

 ------------------------------------------------------------


Date: 30 Mar 1998 15:36:09 GMT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users,
    soc.culture.greek
Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts

In article <6fh6l5$1du@news3.euro.net>,
Denis Liégeois <denis.liegeois@euronet.be> wrote:
: Just a question: ELOT 928 is ISO-8859-7. What is ELOT 927 ?
: 
It is a 7-bit set (like ASCII) in which the lowercase Roman letters
are replaced by uppercase Greek letters.

- Frank


--------------------------------------------------------------


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Message-ID: <567ce$a2a1b.32c@news.kea.bc.ca>
X-Newsreader: Microsoft Outlook Express 4.72.2106.4
X-Mimeole: Produced By Microsoft MimeOLE V4.72.2106.4
Date: Wed, 6 May 1998 10:42:03 -0700
From: "Michael Simms" <michaesi@attachmate.com>
Subject: New Euro Currency symbol and DEC terminals
 
Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will
support the new Euro Currency symbol. Such as which position in the
character sets. Any information would be appreciated.


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Date: 6 May 1998 23:34:42 GMT
From: "T.E.Dickey" <dickey@shell.clark.net>
Newsgroups: comp.terminals
Subject: Re: New Euro Currency symbol and DEC terminals
 
Michael Simms <michaesi@attachmate.com> wrote:
: Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will
: support the new Euro Currency symbol. Such as which position in the
: character sets. Any information would be appreciated.
without a hardware change, they won't (it's not an ISO-8859-1 character).
 
-- 
Thomas E. Dickey
dickey@clark.net
http://www.clark.net/pub/dickey

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Date: 7 May 1998 01:55:26 GMT
From: Jeffrey Altman <jaltman@watsun.cc.columbia.edu>
Newsgroups: comp.terminals
Subject: Re: New Euro Currency symbol and DEC terminals
 
In article <6iqs2i$agh$2@clarknet.clark.net>,
T.E.Dickey <dickey@shell.clark.net> wrote:
:
: Michael Simms <michaesi@attachmate.com> wrote:
: :
: : Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will
: : support the new Euro Currency symbol. Such as which position in the
: : character sets. Any information would be appreciated.
:
: without a hardware change, they won't (it's not an ISO-8859-1 character).
 

It will have to be supported as a soft character set or by the addition
of additional character sets such as ISO-8859-15 which do include the Euro.
 
FYI, Kermit 95 1.1.17 will support all of the new ISO and IBM Code Page
character sets, which include the "Euro".

-- 
    Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
                 The Kermit Project * Columbia University
              612 West 115th St #716 * New York, NY * 10025
  http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Newsgroups: comp.terminals
Organization: Flashnet Communications, http://www.flash.net
Sender: sturgeon@199.165.143.240
Message-ID: <35a101b0.1820019719@news.flash.net>
Date: Mon, 06 Jul 1998 17:02:13 GMT
From: JonS@futuresoft.com (Jon Stugeon)
Subject: Euro currency symbol & dumb terminals/emulators



All,

I've recently been reading about support for the new Euro currency
symbol in the Windows 95/98 & NT O/Ss.  This got me thinking if there
will be any kind of standard for how legacy host applications will
represent the Euro symbol.

Obviously if the final display device is a physical dumb terminal (eg
VT-220) then it won't know anything about the Euro symbol, but if an
emulator is being used then it could be configured to display the Euro
symbol in place of an existing character.

So, which character would be replaced?  My guess is that this would be
done on an ad-hoc, host-to-host basis, but I'd be glad to be put
right.

Regards,
Jon Sturgeon
JonS@futuresoft.com


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Organization: Columbia University
Message-ID: <6nr10p$sf$1@apakabar.cc.columbia.edu>
References: <35a101b0.1820019719@news.flash.net>
NNTP-Posting-Host: watsun.cc.columbia.edu
Date: 6 Jul 1998 17:20:25 GMT
From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman)
Subject: Re: Euro currency symbol & dumb terminals/emulators

In article <35a101b0.1820019719@news.flash.net>,
Jon Stugeon <JonS@futuresoft.com> wrote:
: All,
: 
: I've recently been reading about support for the new Euro currency
: symbol in the Windows 95/98 & NT O/Ss.  This got me thinking if there
: will be any kind of standard for how legacy host applications will
: represent the Euro symbol.
: 
: Obviously if the final display device is a physical dumb terminal (eg
: VT-220) then it won't know anything about the Euro symbol, but if an
: emulator is being used then it could be configured to display the Euro
: symbol in place of an existing character.
: 
: So, which character would be replaced?  My guess is that this would be
: done on an ad-hoc, host-to-host basis, but I'd be glad to be put
: right.
: 
: Regards,
: Jon Sturgeon
: JonS@futuresoft.com
: 


Character-sets (including those with Euro support) are defined by
the ISO as part of standard 8859.  These are to be used by the host.

IBM has defined new code pages for the inclusion of the Euro and
Microsoft has added the Euro to its existing code pages.

Emulators should not make up their own.


    Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
                 The Kermit Project * Columbia University
              612 West 115th St #716 * New York, NY * 10025
  http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman)
Newsgroups: comp.terminals
Subject: Re: Euro currency symbol & dumb terminals/emulators
Date: 8 Jul 1998 20:04:28 GMT
Organization: Columbia University
Lines: 62
Message-ID: <6o0jcc$pqd$1@apakabar.cc.columbia.edu>
References: <35a101b0.1820019719@news.flash.net> <35a28fdd.17881822@news.flash.net> <6nuogi$3n2$1@apakabar.cc.columbia.edu> <35a3a03e.87610587@news.flash.net>
NNTP-Posting-Host: watsun.cc.columbia.edu

In article <35a3a03e.87610587@news.flash.net>,
Jon Stugeon <JonS@futuresoft.com> wrote:
:
: So you're expecting users of host applications to use character
: translation features built-into terminal emulation software to choose
: to map an arbitrary character in the *host* character set to the
: appropriate character representing the Euro in the code page they are
: using in their display font?  Then if the host application needed to
: display the Euro symbol *in addition* to an existing currency symbol
: it would need to be modified to be aware of the configuration of the
: user's emulation package?
: 
: Surely if there was some kind of standard agreed upon for which
: character the host applications will use to represent the Euro then we
: wouldn't have another of those cases where the user doesn't get the
: correct symbol just because his emulation software isn't configured
: correctly.
: 
: Or have I got hold of the wrong end of the stick here?

The way that terminals (and emulation software) are supposed to
work is that the host application instructs the terminal as to
which character-set(s) should be loaded into the G0,G1,G2, and
G3 character-set tables.  These tables are then used to map
a byte from the host to a particular character for display.
If the local system does not support the character-sets used by
the application, it must perform translation to a character-set that
it does support.
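
The G0 to G3 mechanism described above can be modelled in a few lines
of Python.  This is a deliberately small sketch, not a terminal
emulator: it handles only single-byte final characters, only the SO
and SI shifts (G1 and G0 into the left half), and only two sets, ASCII
("B") and a handful of DEC Special Graphics positions ("0").  The
names render, CHARSETS and DESIGNATE are hypothetical.

```python
ESC, SO, SI = 0x1B, 0x0E, 0x0F

# Character sets keyed by their designation final byte.  An empty table
# means "display the byte as its ASCII character".
ASCII_SET = {}
DEC_SPECIAL = {0x6A: "┘", 0x6B: "┐", 0x6C: "┌", 0x6D: "└",
               0x71: "─", 0x78: "│"}
CHARSETS = {"B": ASCII_SET, "0": DEC_SPECIAL}

# Intermediate byte -> which of G0..G3 a 94-character set is loaded into.
DESIGNATE = {"(": 0, ")": 1, "*": 2, "+": 3}


def render(stream: bytes) -> str:
    """Map host bytes to display characters using G0..G3 tables."""
    g = [ASCII_SET, ASCII_SET, ASCII_SET, ASCII_SET]
    gl = 0          # index of the table currently invoked for display
    out = []
    i = 0
    while i < len(stream):
        b = stream[i]
        if (b == ESC and i + 2 < len(stream)
                and chr(stream[i + 1]) in DESIGNATE
                and chr(stream[i + 2]) in CHARSETS):
            # ESC <intermediate> <final>: load a set into one of G0..G3.
            g[DESIGNATE[chr(stream[i + 1])]] = CHARSETS[chr(stream[i + 2])]
            i += 3
            continue
        if b == SO:
            gl = 1      # shift out: display through G1
        elif b == SI:
            gl = 0      # shift in: display through G0
        else:
            out.append(g[gl].get(b, chr(b)))
        i += 1
    return "".join(out)
```

For instance, render(b"a\x1b)0\x0eqqq\x0fb") loads DEC Special
Graphics into G1, shifts to it for three bytes, and yields "a───b".
The same "q" bytes sent without the designation are shown as plain
letters.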

There are international standards for all of this.  The ISO defined
ISO 2022 more than 20 years ago to address the host to terminal
assignment of character-sets and the mechanisms for switching
between them.

ISO 8859 defines the agreed upon International character-sets.
Part 15 declares the newly formed Western European character set
which includes the Euro.  

IBM maintains the Code Page Registry.  As such they introduced new
code pages for both ASCII and EBCDIC systems that include the Euro
for use in their operating systems (DOS and OS/2 on the PC; OS/400;
OS/390, ...).

Microsoft maintains its own Code Pages for Windows which are registered
with IBM as Code Pages 1250-1258.  These are based on the ISO 8859
character-sets but include printable characters in the C1 range.
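
The difference in the C1 range (bytes 0x80 to 0x9F) can be inspected
with Python's codecs.  A minimal sketch, assuming Python's "cp1252"
and "latin-1" codecs as stand-ins for Windows Code Page 1252 and
ISO 8859-1; the function name c1_printables is hypothetical.

```python
import unicodedata


def c1_printables(codec: str) -> dict:
    """Return {byte: character} for printable characters in 0x80..0x9F."""
    found = {}
    for byte in range(0x80, 0xA0):
        try:
            ch = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            continue  # position left undefined by this codec
        if unicodedata.category(ch) != "Cc":  # skip control characters
            found[byte] = ch
    return found
```

c1_printables("latin-1") is empty, since ISO 8859-1 reserves the whole
range for control characters, while c1_printables("cp1252") includes
the Euro sign at 0x80 and the trademark sign at 0x99.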

And then, of course, Unicode has defined a position for the Euro in 
version 2.1 of that standard.  (0x20AC)

ISO 2022 was used as the basis for the character-set handling in
ANSI X3.64-1979 (since withdrawn) which was the basis for most 
Unix consoles and the DEC VT terminal line.  It is also the basis
of ISO-6429, which is the international standard which replaced
ANSI X3.64-1979.

Since FutureSoft is a manufacturer of terminal emulation software
I would have expected you to know all this.  How can Dynacomm 
emulate a VT terminal if it doesn't support this functionality?

    Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
                 The Kermit Project * Columbia University
              612 West 115th St #716 * New York, NY * 10025
  http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Sender: sturgeon@199.165.143.240
Message-ID: <35a4fe63.111711372@news.flash.net>
References: <35a101b0.1820019719@news.flash.net> <35a28fdd.17881822@news.flash.net> <6nuogi$3n2$1@apakabar.cc.columbia.edu> <35a3a03e.87610587@news.flash.net> <6o0jcc$pqd$1@apakabar.cc.columbia.edu>
NNTP-Posting-Host: 199.165.143.240
Date: Wed, 08 Jul 1998 23:23:26 GMT
From: JonS@futuresoft.com (Jon Stugeon)
Subject: Re: Euro currency symbol & dumb terminals/emulators


On 8 Jul 1998 20:04:28 GMT, jaltman@watsun.cc.columbia.edu (Jeffrey
Altman) wrote:

>Since FutureSoft is a manufacturer of terminal emulation software
>I would have expected you to know all this.  How can Dynacomm 
>emulate a VT terminal if it doesn't support this functionality?


Thanks for the comprehensive reply, Jeffrey.

DynaComm indeed emulates a VT terminal, including support for NRCs,
DEC Supplemental/Graphics etc etc, but that does not necessarily mean that
everybody that works for the manufacturer has the benefit of and
understands the years of history behind character set development.
Furthermore, not everybody that works for FutureSoft works in terminal
emulation.

I am trying to understand what, if any, modifications would be
necessary to provide "support for the Euro", that is the reason for my
original enquiry.

Regards,
Jon Sturgeon
JonS@futuresoft.com


 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Date: 9 Jul 1998 04:58:49 GMT
Organization: Columbia University
Message-ID: <6o1im9$cgj$1@apakabar.cc.columbia.edu>
References: <35a101b0.1820019719@news.flash.net> <35a3a03e.87610587@news.flash.net> <6o0jcc$pqd$1@apakabar.cc.columbia.edu> <35a4fe63.111711372@news.flash.net>
NNTP-Posting-Host: watsun.cc.columbia.edu
From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman)
Subject: Re: Euro currency symbol & dumb terminals/emulators


In article <35a4fe63.111711372@news.flash.net>,
Jon Stugeon <JonS@futuresoft.com> wrote:
:
: DynaComm indeed emulates a VT terminal, including support for NRCs,
: DEC Supplemental/Graphics etc etc, but that does not necessarily mean that
: everybody that works for the manufacturer has the benefit of and
: understands the years of history behind character set development.
: Furthermore, not everybody that works for FutureSoft works in terminal
: emulation.
: 
: I am trying to understand what, if any, modifications would be
: necessary to provide "support for the Euro", that is the reason for my
: original enquiry.


I apologize for assuming more knowledge than you have at your disposal.

I assumed (obviously incorrectly) that either you would be asking this
question because you are somehow involved in your company's terminal
emulation development; or that you have spoken to your own developers
before asking this query on the Net.

While I am a strong believer in the open sharing of knowledge, I must
admit that I am a bit hesitant to provide a direct competitor with
information that will help it take sales away from my product. On the
other hand, I couldn't let someone come up with yet another hack
solution (that I would end up needing to support for a customer in
five years) because of ignorance.

Hope this thread has been useful.

-- 
    Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
                 The Kermit Project * Columbia University
              612 West 115th St #716 * New York, NY * 10025
  http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org

 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
References: <8aoi4p$uv1$1@nnrp1.deja.com>
Message-ID: <8aooiv$2i5$1@newsmaster.cc.columbia.edu>
Date: 15 Mar 2000 19:33:51 GMT
Organization: Columbia University
From: Jeffrey Altman <jaltman@watsun.cc.columbia.edu>
Subject: Re: change from ascii to ansi character set in DOS window

In article <8aoi4p$uv1$1@nnrp1.deja.com>,  <tedsung6674@my-deja.com> wrote:
: Hi,
:
: I am running NT 4.0 SP3.  My DOS window currently is displaying
: the ASCII character set.  However I want it to display the ANSI
: character set.  How do I do this?
:

The Console window is Unicode based.  The font that is displayed
is Unicode if you are using a TrueType font such as Lucida Console or
Code Page based (CP437, CP850, ...) if you are using raster fonts.

The console application has a choice of writing to the screen
using the active Code Page or Unicode.  NT provides the proper
translations.

CP1252 is the Windows variation of ISO-Latin1 that you refer to as ANSI.

To use this code page in your application, use SetConsoleCP() and
SetConsoleOutputCP().
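
From a script, the two calls can be reached through ctypes.  This is a
hedged sketch: SetConsoleCP and SetConsoleOutputCP are the Win32
functions named above, but the wrapper set_console_code_page is
hypothetical, and on anything other than Windows it simply reports
failure.

```python
import ctypes
import sys


def set_console_code_page(code_page: int) -> bool:
    """Set the console input and output code pages; True on success."""
    if sys.platform != "win32":
        return False  # the Win32 console API is only available on Windows
    kernel32 = ctypes.windll.kernel32
    ok_in = kernel32.SetConsoleCP(code_page)         # keyboard input
    ok_out = kernel32.SetConsoleOutputCP(code_page)  # screen output
    return bool(ok_in) and bool(ok_out)
```

Calling set_console_code_page(1252) asks for CP1252.  Whether the
characters then display correctly still depends on the console font,
as described above.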

    Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
                 The Kermit Project * Columbia University
              612 West 115th St #716 * New York, NY * 10025
  http://www.kermit-project.org/k95.html * <kermit-support@kermit-project.org>

 //////////////////////////////////////////////////////////////////////////////
Newsgroups: comp.terminals
References: <8kc2os$fkm$1@as102.tel.hr>
Date: Mon, 10 Jul 2000 18:43:50 GMT
Organization: @Home Network
Message-ID: <GPoa5.58015$lU5.386894@news1.rdc1.nj.home.com>
From: dls2 <dlshearer@home.com>
Subject: Re: Setting keyboard over esc sequences on VT510/520

"IdrEASY" <IdrEASY@bigfoot.com> wrote:
> Hi!
>
> I need to set the Croatian keyboard via an escape sequence
> (SCS = Croatian/Slovenian Latin).
> I also know that "ESC(&3" is the sequence for Russian Cyrillic.
>
> I wrote a little program with a double loop to generate the escape
> sequences, but only had success with the
> Russian keyboard.
>
> Please, help me.
>
> Bye!

The "(" represents the (94-character) G0 character set.
The ")" represents the (94-character) G1 character set.
The "*" represents the (94-character) G2 character set.
The "+" represents the (94-character) G3 character set.

The "-" represents the (96-character) G1 character set.
The "." represents the (96-character) G2 character set.
The "/" represents the (96-character) G3 character set.

Russian NRCS is "&5", not "&3".
SCS NRCS is "%3".

So you should be using "ESC(%3".
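
The table above maps directly onto a small helper that builds the
select-character-set sequence.  This is a sketch in Python; the names
INTERMEDIATE and select_charset are hypothetical, and the designator
strings are taken from this post rather than checked against a
terminal manual.

```python
# (set size, G-set number) -> intermediate character, as listed above.
INTERMEDIATE = {
    (94, 0): "(", (94, 1): ")", (94, 2): "*", (94, 3): "+",
    (96, 1): "-", (96, 2): ".", (96, 3): "/",
}


def select_charset(g: int, designator: str, size: int = 94) -> bytes:
    """Build ESC + intermediate + designator for loading a set into Gn."""
    try:
        intermediate = INTERMEDIATE[(size, g)]
    except KeyError:
        raise ValueError(f"no {size}-character designation for G{g}")
    return b"\x1b" + (intermediate + designator).encode("ascii")
```

select_charset(0, "%3") returns the bytes for "ESC(%3".  Asking for a
96-character set in G0 raises ValueError, since the list above gives
no intermediate for that combination.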


  --  Derrick Shearer

 //////////////////////////////////////////////////////////////////////////////

Newsgroups: comp.terminals
References: <8kc2os$fkm$1@as102.tel.hr>
    <GPoa5.58015$lU5.386894@news1.rdc1.nj.home.com> <8kho8g$2lr$1@as102.tel.hr>
    <8khpnp$im8$3@news1.Radix.Net>
Date: Wed, 12 Jul 2000 14:04:44 +0100
Organization: RDEL
Message-ID: <396C6CEC.ABE87C5E@rdel.co.uk>
From: Paul Williams <flo@rdel.co.uk>
Subject: Re: Setting keyboard over esc sequences on VT510/520

Thomas Dickey wrote:
>
> IdrEASY <IdrEASY@bigfoot.com> wrote:
> >>
> >> Russian NRCS is "&5", not "&3".
> >> SCS NRCS is "%3".

>
> > Sorry, on my terminal (&4 is Russian, but (%3 does nothing.
>
> what type of terminal is that?
> (I'm assuming vt220)

It says VT510/520 on the subject line, Tom. (Yes, I hate it when vital
information is only mentioned on the subject line!)

 //////////////////////////////////////////////////////////////////////////////


Message-ID: <slrn8tr1eu.6lp.vek@pharmnl.ohout.pharmapartners.nl>
References: <lvxC5.6624$DX4.203313@wagner.videotron.net>
    <Pine.OSF.4.21.0010032041570.169204-100000@goedel2.math.washington.edu>
    <Pine.PCP.3.91.1001004233418.1498B-100000@[204.111.21.127]>
    <Pine.OSF.4.21.0010050852230.8434-100000@pearl.cs.sc.edu>
    <Pine.PCP.3.91.1001005115632.1498A-100000@[204.111.54.33]>
    <Pine.OSF.4.21.0010051323301.8809-100000@pearl.cs.sc.edu>
NNTP-Posting-Host: mail.pharmapartners.nl
Newsgroups: comp.mail.pine
Date: 6 Oct 2000 07:57:19 GMT
From: Villy Kruse <vek@pharmnl.ohout.pharmapartners.nl>
Subject: Re: Pine and French characters

On Thu, 5 Oct 2000 13:29:02 -0400, Gopi Sundaram <gopalan@cs.sc.edu> wrote:
>On Thu, 5 Oct 2000, Samuel W. Heywood wrote:
>
>> If the character set used in Windows is not backward-compatible
>> with DOS, then Windows does not adhere to the standard.
>
>I don't know what standards you are talking about, but I'm glad that
>Windows finally used the ISO standard, whereas DOS didn't.
>


Well, actually Windows tries to "improve" on iso-8859-1 and calls that
windows-1252.  The difference is that some values in the range 0x80
to 0x9f have been assigned to characters that are missing in iso-8859-1.
For example the euro sign is 0x80 in win1252, but doesn't exist in
iso-8859-1.  However it will be 0xA4 in iso-8859-15 aka latin-9.
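
These byte values can be checked with Python's built-in codecs.  A
minimal sketch; the helper name euro_byte is hypothetical, and the
codec names are Python's spellings of the three character sets.

```python
EURO = "\u20ac"  # the Euro sign


def euro_byte(codec: str):
    """Return the Euro sign encoded in `codec`, or None if it has none."""
    try:
        return EURO.encode(codec)
    except UnicodeEncodeError:
        return None


win1252 = euro_byte("cp1252")      # b"\x80"
latin9 = euro_byte("iso8859_15")   # b"\xa4"
latin1 = euro_byte("latin-1")      # None: not in iso-8859-1
```

The same byte means something else when read back under the wrong
assumption: 0xA4 decoded as iso-8859-1 is the generic currency sign,
not the Euro.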

Check the alphabet soup at 

    http://www.czyborra.com/

and see how standard the various standards really are.

Villy

 //////////////////////////////////////////////////////////////////////////////


Newsgroups: comp.mail.pine
Message-ID: <Pine.GHP.4.21.0010091245080.12230-100000@hpplus03.cern.ch>
References:
    <Pine.OSF.4.21.0010032041570.169204-100000@goedel2.math.washington.edu>
    <Pine.PCP.3.91.1001004233418.1498B-100000@[204.111.21.127]>
    <Pine.OSF.4.21.0010050852230.8434-100000@pearl.cs.sc.edu>
    <Pine.PCP.3.91.1001005115632.1498A-100000@[204.111.54.33]>
    <Pine.GHP.4.21.0010051851170.3117-100000@hpplus03.cern.ch>
    <Pine.PCP.3.91.1001005170824.1498D-100000@[204.111.24.209]>
    <nhtcapri-ya02408000R0610001552130001@news.rrzn.uni-hannover.de>
    <Pine.PCP.3.91.1001007204737.1498A-100000@[204.111.54.179]>
Date: Mon, 9 Oct 2000 12:54:19 +0200
Organization: Knights of the Round Tuit
From: "Alan J. Flavell" <flavell@mail.cern.ch>
Subject: Re: Pine and French characters

On Sat, 7 Oct 2000, Samuel W. Heywood wrote:

> Thanks a lot for the URL.  Now that I've read about QUOTED PRINTABLE I
> understand that it probably would be best to load a code page for
> handling this ISO-8859-1 character set.

Excuse me but you're not quite with us yet.

You would certainly be advised to load a code page that covers the
Latin-1 repertoire; but the recommendation would be to load the cp850
code page, which covers this repertoire but it's _not_ the iso-8859-1
character coding itself.  PINE knows how to mediate between the two,
as we've already covered in this thread.

There is a relatively obscure code page, cp819, which represents the
iso-8859-1 character coding.  However, if you load it, you are going
to find quite a number of conventional DOS applications displaying
bizarre characters in their menus etc, instead of the DOS "box
drawing" characters which they expected.
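
Both points can be checked with Python's codecs.  This is a sketch,
using Python's "latin-1" codec to stand in for cp819 and the
hypothetical helper name covered_by.

```python
# The upper half of ISO-8859-1: every character from 0xA0 to 0xFF.
LATIN1_REPERTOIRE = "".join(chr(c) for c in range(0xA0, 0x100))


def covered_by(codec: str) -> bool:
    """True if every Latin-1 character can be encoded in `codec`."""
    try:
        LATIN1_REPERTOIRE.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Same repertoire, different coding: e-acute has different byte values.
e_acute_cp850 = "é".encode("cp850")      # b"\x82"
e_acute_latin1 = "é".encode("latin-1")   # b"\xe9"

# A byte that DOS programs use for a horizontal box-drawing line.
box_cp850 = b"\xc4".decode("cp850")      # "─"
box_latin1 = b"\xc4".decode("latin-1")   # "Ä"
```

covered_by("cp850") is True while covered_by("cp437") is False, which
is why cp850 is the usual recommendation.  The last two lines show the
menu problem: a program that writes 0xC4 expecting a box-drawing line
gets a capital A with diaeresis once the iso-8859-1 coding is loaded.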

You'll find some brief (and old) notes of mine here
http://ppewww.ph.gla.ac.uk/~flavell/iso8859/iso8859-pointers.html#cp819
but I don't recommend that.  Unless you have some special requirement
that we haven't discussed here, I recommend that you use cp850.

cheers

Alan

 //////////////////////////////////////////////////////////////////////////////

