Character Set News =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.dcom.telecom Path: cs.utk.edu!emory!europa.eng.gtefsd.com!howland.reston.ans.net !spool.mu.edu!telecom-request Date: 29 Oct 1993 12:20 -0600 Message-ID: X-Telecom-Digest: Volume 13, Issue 725, Message 1 of 8 From: Rob Slade Subject: Book Review: "The Unicode Standard" BKUNICOD.RVW 980921 Addison-Wesley Publishing Co. P.O. Box 520 26 Prince Andrew Place Don Mills, Ontario M3C 2T8 416-447-5101 fax: 416-443-0948 or 1 Jacob Way Reading, MA 01867-9984 800-527-5210 617-944-3700 or 5851 Guion Road Indianapolis, IN 46254 800-447-2226 or Unicode, Inc. 1965 Charleston Road Mountain View, CA 94043 (415) 961-4189 Fax: (415) 966-1637 "The Unicode Standard", U$32.95/C$42.95 In the dim and distant past, the late (and generally unlamented) SUZY Information System was born in Vancouver. Rather an oddball as far as online services went, one "feature" was that the programmer had tried to allow for the use of all of the IBM graphics characters. This led to an entirely new field of "smiley" or "emoticon" (emotional icon) endeavours. Instead of the usual sideways happy face of the colon, hyphen and right parenthesis; ":-)"; we were able to use the "Ctrl-A" alternative of the IBM PC character set. Having a decimal value of one, this character is an upright happy face. This allowed other expansions, such as Ctrl-A and the right square bracket, which looks like a face and a telephone handset, and was used (usually in the "chat" modes) for "I am on the phone." "How nice," I hear you mutter between clenched teeth. "Can we now get on with the review?" Patience, stout nerds. This *is* the review. As SUZY users, particularly those who had been introduced to computer communications on the system, moved on to other services or local bulletin boards, they were usually quite shocked to find that their favourite symbols no longer worked. 
The little diamond (Ctrl-C) would kill a message on a VAX. Fidonet users might find that the cute tagline they had formed from graphics characters completely disappeared when they sent the message through an Internet gateway. ASCII (the American Standard Code for Information Interchange) is widely, and mistakenly, believed to define two hundred and fifty-six characters. It doesn't. Furthermore, of the hundred and twenty-eight characters it does define, many are "control" rather than printable characters. (The "card suit" symbols on the IBM PC graphics set are defined as "end of text", "end of transmission", "enquiry" and "acknowledgement" under the real ASCII standard.) In addition, many believe ASCII to be a universal standard; also not true. An octet with the decimal value thirty-five, for example, is the number sign (sometimes called an "octothorpe") in the United States, but a pound sign (the British currency) in Britain. As with most fields of computer endeavor, the nice thing about standards is that there are so many to choose from. Many vary only slightly--but they vary. The point is that there are a number of symbols which we commonly know, but which cannot be consistently displayed on terminals or printers. Certain terminals will have certain "international" character sets, but not all are identical. Accents and other phonetic modifiers may be difficult to handle: entire character sets are given over strictly to accented characters. (In Canada we are acutely aware of the problems, with "French" keyboards used at many sites. On one, I was having difficulty finding some necessary punctuation marks for network addressing, and asked a Francophone programmer for help. "Who knows," he growled, "I never use the ____ things!") Unicode seeks to address this problem. Including not only the variations on the Latin alphabet, Unicode incorporates Greek, Cyrillic, Hebrew and other alphabets. 
It also includes punctuation, diacriticals, mathematical and scientific symbols and miscellaneous graphics. Asian ideographs are also assigned codes. This is no longer suitable, of course, for a seven-bit code, and Unicode is based on a sixteen-bit address space. The book gives some background and plans (chapter one), general principles and rules for conformance (chapter two). To comment on these in any meaningful way would be to rewrite these chapters. This is technical material, though not the same technology that computer types are used to. Some background study in linguistics would be a good idea, although it is not strictly necessary to understand and use the Unicode standard. There are, however, a wealth of symbols, punctuation marks and typesetting codes which Unicode gives standardized access to. On the other hand, any application which used the standard in a significant way would likely require a linguistics background in any case. The bulk of the books (two volumes) is, of course, taken up with the actual code charts. (Volume two, in fact, is almost completely concerned with Han ideographs. In spite of the recent widespread use of the English alphabet, this is still the standard written language of Chinese, Japanese and Korean: CJK in Unicode terminology.) The charts are augmented with verbal definitions of the symbols, and with cross references to similar forms. The Unicode standard is recent. In comparative terms its current usage is negligible. However, it is the defacto standard for broadly based international character sets. With the recent rejection of the proposed ISO thirty-two bit standard, and the recasting of that standard to follow Unicode's lead, Unicode is a significant factor in the development of any international applications. copyright Robert M. Slade, 1993 BKUNICOD.RVW 980921 (Postscriptum - Unicode Inc. maintains an FTP site at unicode.org (192.195.185.2). Some of the mapping tables, and the Han cross reference lists are available. 
Some tables are also available on IBM PC or Mac compatible floppy disks.) http://www.unicode.org/ Permission granted to distribute only with unedited copies of TELECOM Digest and associated newsgroups/mailing lists. DECUS Canada Communications, Desktop, Education and Security group newsletters Editor and/or reviewer ROBERTS@decus.ca, RSlade@sfu.ca, Rob Slade at 1:153/733 DECUS Symposium '94, Vancouver, BC, Mar 1-3, 1994, contact: rulag@decus.ca =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= An older introductory book on this subject is "Coded Character Sets: History and Development" by C. E. MacKenzie. Reading: Addison-Wesley, 1980. =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.std.internat,comp.protocols.tcp-ip Path: utkcs2!emory!samsung!cs.utexas.edu!sun-barr!decwrl!mcnc!uvaarpa!murdoch Message-ID: <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU> Date: 10 Apr 91 17:27:56 GMT References: <16968@hoptoad.uucp> <1110@sranha.sra.co.jp> Sender: usenet@murdoch.acc.Virginia.EDU Organization: University of Virginia Lines: 60 From: randall@Virginia.EDU (Randall Atkinson) Subject: Re: universality of Latin-1 John Gilmore originally wrote: % And my windows all use ISO Latin 1. If Torbj|rn would send the % umlauted letter in that standardized character set, it would look right % in both the States and in Sweden. In article <1110@sranha.sra.co.jp>, Erik M. van der Poel responded: > > Have you ever tried to send yourself a message in Latin-1? Did it > work? And even if *you* have a reasonable version of sendmail (one > that doesn't strip the 8th bit), what makes you so certain that > Torbj|rn's message and anyone else's won't pass through a site that > *does* strip the 8th bit? It does work for a fair and ever increasing subset of the Internet. BITNET doesn't do very well with it. Clearly we need to move towards 8-bit and 16-bit and 32-bit transparent mail-transport mechanisms. 
Fortunately there are a number of possible transport mechanisms out there to choose from, some of which are already 8-bit transparent. > Also, what's so "standardized" about ISO Latin-1? What makes it more > standard than, say, Latin-2? ISO 8859/1 is NOT any "more standard" than ISO 8859/2, however sites in the US are in fact migrating towards ISO 8859/1 from US ASCII and most sites in the US are NOT migrating towards ISO 8859/2 (though they might support it on the side as vendors begin to). The languages that are most commonly used in the US are in ISO 8859/1 and the languages supported by ISO 8859/2 are less commonly used (again in the US as a whole). Note that ISO Latin-1 is ISO 8859/1 which is the 8-bit character set used for Western European languages. ISO Latin-2 is ISO 8859/2 which is the 8-bit character set for Eastern European languages. Clearly we need to add additional information to the header of mail messages to indicate which character set to use. I'm not sure of the current state of the Internet protocols (RFC 822 et. al.) with respect to this. If there isn't the equivalent of a "Character-set:" header yet, serious consideration should be given to adding one with clearly defined values for at least existing ANSI and ISO character sets. [ARCHIVER'S NOTE: the Multipurpose Internet Mail Extensions (MIME) protocol defines character-set-selection headers for SMTP e-mail. See the Internet standards RFC1521, RFC1523, and RFC1425.] Character sets that should have a defined string to use with such a header field include at least: ASCII ISO 8859/1 ... ISO 8859/N (where N is the last defined set) ISO 10646 (once it gets completed) The Internet is the dominant mail transport network at present, partly because so many other networks gateway with it. Getting the Internet to convert to supporting such needs would be a big step in the right direction. Perhaps someone on the IETF can comment on their current activities in this area ?? 
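[ARCHIVER'S NOTE: the "Character-set:" header asked for above is, in MIME terms, the charset parameter of the Content-Type header. The following is only a minimal sketch, using Python's standard-library email package, of a message whose body is labelled as ISO 8859/1. The addresses, subject and sample text are made up for illustration.]

```python
from email.message import EmailMessage


def build_latin1_message(body: str) -> EmailMessage:
    """Build a message whose text body is declared as ISO 8859/1."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"       # hypothetical address
    msg["To"] = "recipient@example.org"      # hypothetical address
    msg["Subject"] = "Latin-1 test"          # hypothetical subject
    # The charset argument ends up as the Content-Type charset parameter.
    msg.set_content(body, charset="iso-8859-1")
    return msg


msg = build_latin1_message("Torbj\u00f6rn\n")
print(msg["Content-Type"])
print(msg.get_content_charset())
```

A receiving program can read the label back with get_content_charset() and pick the matching decoder, instead of guessing which 8-bit set the sender used.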
Ran Atkinson
randall@Virginia.EDU

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.std.internat,comp.protocols.tcp-ip
Path: utkcs2!emory!swrinde!cs.utexas.edu!sun-barr!newstop!sun!amdcad!dgcad
      !dg-rtp!chutney!eliot
Message-ID: <1991Apr12.124741.11555@dg-rtp.dg.com>
Date: 12 Apr 91 12:47:41 GMT
References: <16968@hoptoad.uucp> <1110@sranha.sra.co.jp>
            <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU>
Organization: Data General Corporation, Research Triangle Park, NC
From: eliot@chutney.rtp.dg.com (Topher Eliot)
Subject: Re: universality of Latin-1

In article <1991Apr10.172756.4991@murdoch.acc.Virginia.EDU>,
randall@Virginia.EDU (Randall Atkinson) writes:
|>
|> In article <1110@sranha.sra.co.jp>,
|> Erik M. van der Poel responded:
|> >Have you ever tried to send yourself a message in Latin-1? Did it
|> >work? And even if *you* have a reasonable version of sendmail (one
|> >that doesn't strip the 8th bit), what makes you so certain that
|> >Torbj|rn's message and anyone else's won't pass through a site that
|> >*does* strip the 8th bit?

|> It does work for a fair and ever increasing subset of the Internet.
|> BITNET doesn't do very well with it. Clearly we need to move towards
|> 8-bit and 16-bit and 32-bit transparent mail transport mechanisms.

I expected to see someone else post a more authoritative answer, but
since none has been forthcoming, I will venture. The folks who work on
such things have been considering the 8-bit, different-codeset issues
as part of a much larger picture that includes such things as graphics
and other binary information in mail. Since those are harder problems,
they won't have solutions all that quickly. There is a mailing list on
this subject; if you really need it I can probably dig out a lead on
how to get onto that mailing list.

|> Fortunately there are a number of possible transport mechanisms out
|> there to choose from, some of which are already 8-bit transparent.

Ack! "Fortunately"? There is an ancient curse: "may you live in
interesting times". I think its modern equivalent is "may you have many
standards to choose from".

--
Topher Eliot
Data General DG/UX Internationalization                  (919) 248-6371
62 T. W. Alexander Dr., Research Triangle Park, NC 27709
eliot@dg-rtp.dg.com                    {backbone}!mcnc!rti!dg-rtp!eliot
Obviously, I speak for myself, not for DG.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.misc
Path: utkcs2!emory!sol.ctr.columbia.edu!spool.mu.edu!agate!sunkist.berkeley.edu
Message-ID: <1991May29.000449.19048@agate.berkeley.edu>
Date: 29 May 91 00:04:49 GMT
References: <10599@castle.ed.ac.uk>
Reply-To: raymond@math.berkeley.edu (Raymond Chen)
In-Reply-To: eanv20@castle.ed.ac.uk (John Woods)
From: raymond@math.berkeley.edu (Raymond Chen)
Subject: Re: Name that character! (definitive list)

Why does everyone feel compelled to post their favorite pronunciations?

In article <10599@castle.ed.ac.uk>, eanv20@castle (John Woods) writes:
>I wonder if there is a definitive list...

Indeed there is. It used to be part of the comp.unix.questions
Frequently Asked Questions file, but it has since moved into the
`Jargon File'. Many thanks to Maarten Litmaath for maintaining the
USENET ASCII Pronunciation Guide for many years.

(Though the list below does seem to be missing some of the cleverer
names in Maarten's list. Like `Donald Duck' for `&'.)

[American Standard Code for Information Interchange] /as'kee/ n.
Common slang names for ASCII characters are collected here. See
individual entries for , , , , , , , , , , , and . This list derives
from revision 2.2 of the USENET ASCII pronunciation guide. Single
characters are listed in ASCII order; character pairs are sorted by
first member. For each character, "official" names appear first, then
others in order of popularity (more or less).

  !    exclamation point, exclamation, bang, factorial, excl, ball-bat,
       pling, smash, shriek, cuss, wow, hey, wham
  "    double quote, quote, dirk, literal mark, rabbit ears
  #    number sign, sharp, crunch, mesh, hex, hash, flash, grid,
       pig-pen, tictactoe, scratchmark, octothorpe, thud
  $    dollar sign, currency symbol, buck, cash, string (from BASIC),
       escape (from ), ding, big-money, cache
  %    percent sign, percent, mod, double-oh-seven
  &    ampersand, amper, and, address (from C), andpersand
  '    apostrophe, single quote, quote, prime, tick, irk, pop, spark
  ()   open/close parenthesis, left/right parenthesis, paren/thesis,
       lparen/rparen, parenthisey, unparenthisey, open/close round
       bracket, ears, so/already, wax/wane
  *    asterisk, star, splat, wildcard, gear, dingle, mult
  +    plus sign, plus, add, cross, intersection
  ,    comma, tail
  -    hyphen, dash, minus sign, worm
  .    period, dot, decimal point, radix point, point, full stop, spot
  /    virgule, slash, stroke, slant, diagonal, solidus, over, slat
  :    colon
  ;    semicolon, semi
  <>   angle brackets, brokets, left/right angle, less/greater than,
       read from/write to, from/into, from/toward, in/out,
       comesfrom/gozinta (all from UNIX), funnel, crunch/zap, suck/blow
  =    equal sign, equals, quadrathorp, gets, half-mesh
  ?    question mark, query, whatmark, what, wildchar, ques, huh, hook
  @    at sign, at, each, vortex, whorl, whirlpool, cyclone, snail,
       ape, cat
  V    vee, book
  []   square brackets, left/right bracket, bracket/unbracket, bra/ket,
       square/unsquare, U turns
  \    reversed virgule, backslash, bash, backslant, backwhack,
       backslat, escape (from UNIX), slosh.
  ^    circumflex, caret, uparrow, hat, chevron, sharkfin, to ("to the
       power of"), fang
  _    underscore, underline, underbar, under, score, backarrow
  `    grave accent, grave, backquote, left quote, open quote,
       backprime, unapostrophe, backspark, birk, blugle, back tick, push
  {}   open/close brace, left/right brace, brace/unbrace, curly bracket,
       curly/uncurly, leftit/rytit, embrace/bracelet
  |    vertical bar, bar, or, or-bar, v-bar, pipe, gozinta, thru,
       pipesinta (last four from UNIX)
  ~    tilde, squiggle, approx, wiggle, twiddle, swung dash, enyay

Some other common usages cause odd overlaps. The ``$'', ``#'', and
``&'' chars, for example, are all pronounced `hex' in different
communities because various assemblers use them as a prefix tag for
hexadecimal constants (in particular, $ in the 6502 world and & on the
Sinclair and some other Z80 machines).

................................................
ARCHIVER'S NOTE
The jest about Donald Duck comes from the name used for this
Disney character in Denmark: "Anders And".
................................................

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.std.internat
Path: utkcs2!emory!att!bu.edu!wang!ice
Message-ID:
Date: 14 Jun 91 22:02:07 GMT
References: <5565@mrmarx.UUCP>
Organization: Addictive Technologies and Various Magick
From: ice@wang.com (Fredrik Nyman)
Subject: Re: HELP requested on internationalization

sgh@mrmarx.msc.com (Satyen Harve) writes:
>
>I have just been given the responsibility of coming up with a
>plan to internationalize our product. As a first step, I have
>to identify all the issues that are involved and determine
>their impact on our product. I would very much appreciate
>hearing from someone who has gone through or is going through
>this process.

>I'd particularly like to get any tips or information on what
>all is involved and where to go to read more about it. We are
>hoping to address both Europe and Asian markets.
I'd like to suggest that you get:

   "Digital Guide to Developing International Software"
   from Digital Press.
   Order # EY-F577E-DP
   ISBN # 1-55558-063-7

The book is geared towards the DEC platforms and the various libraries
available to VMS, Ultrix and DECwindows programmers. Even if you
couldn't care less about these platforms, the book is very valuable.
Among other things, it describes common character sets and has quite
extensive guidelines for dealing with internationalization which are
valid no matter what platform you're using.

DEC can be reached at 1-800-DIGITAL if you want to order this manual.
Outside the US, in New Hampshire, Alaska and Puerto Rico: 1-603-884-6660

--
Fredrik Nyman [Surgically Enhanced Cyberdweeb] DoD #0328
Global Adaptation Center, Wang, M/S 019-490,     NeXT:
One Industrial Ave., Lowell MA 01851, USA        BITNET:

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.os.vms
Path: utkcs2!emory!swrinde!cs.utexas.edu!sun-barr!newstop!west!texsun!smunews!txsil!danmc
Message-ID: <475@txsil.lonestar.org>
Date: 15 Jun 91 23:00:32 GMT
References: <199113.1053.9712@canrem.uucp>
Distribution: comp.os.vms
Organization: Summer Institute of Linguistics, Dallas
From: danmc@txsil.lonestar.org (Dan McDonald)
Subject: Re: vt3xx soft fonts??

In article <199113.1053.9712@canrem.uucp> "jonathan harley" writes:
>
>Do you know of any available packages that provide VT3xx (or better)
>downloadable soft fonts to emulate the IBM PCs graphics character set?

As for ones that emulate the IBM PC's, no, but it would probably only
take a couple of hours to make one - there are only 128 characters to
set up.

>
>If so, where might I obtain the soft fonts, how much $ etc.
>

I wrote a program (in DCL - my favorite programming language) that
would take bitmaps in a form like:

   A 65
   1    X
   2   X X
   3  X   X
   4 X     X
   5 XXXXXXX
   6 X     X
   7 X     X
   8

and would convert them to the down-line loadable format.
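[ARCHIVER'S NOTE: Dan's DCL program is not reproduced here. As a rough sketch of the conversion he describes, the Python below turns a bitmap drawn with X's into sixel-style column data. It assumes the usual DEC convention as I understand it: each output character covers one column of six pixels, with the top pixel in the least significant bit and an offset of 0x3F, and bands of six rows are separated by "/". The function name and the glyph layout are made up for illustration, and the surrounding download control sequence is deliberately left out.]

```python
def glyph_to_sixels(rows, width):
    """Encode a glyph (list of strings, 'X' = lit pixel) as sixel columns.

    Assumption: one character per column of six vertical pixels, top
    pixel in bit 0, offset by 0x3F ('?'); bands are joined with '/'.
    """
    bands = []
    for top in range(0, len(rows), 6):
        band = rows[top:top + 6]
        chars = []
        for col in range(width):
            bits = 0
            for i, row in enumerate(band):
                if col < len(row) and row[col] == "X":
                    bits |= 1 << i
            chars.append(chr(0x3F + bits))
        bands.append("".join(chars))
    return "/".join(bands)


# Hypothetical 7x8 letter "A", in the spirit of the bitmap shown above.
LETTER_A = [
    "   X   ",
    "  X X  ",
    " X   X ",
    "X     X",
    "XXXXXXX",
    "X     X",
    "X     X",
    "       ",
]

print(glyph_to_sixels(LETTER_A, 7))
```

Each glyph becomes one short string per band of rows; a real loader would still have to wrap these strings in the terminal's font-download sequence, which differs between models.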
I use it mainly when I need to design another International Phonetic Alphabet softfont for someone writing a thesis around here. If you would like code and an example of how to use it, send me e-mail and I will be happy to dig it up and send it to you. ****************************************************************************** Dan McDonald * UUCP ...utafll!txsil!dalsil!mcdonald Summer Institute of Linguistics * Internet mcdonald@dallas.sil.org Dallas Computer Services * -OR- danmc@txsil.lonestar.org 7500 W Camp Wisdom Rd * SILnet DAN.MCDONALD@A1@DALLAS Dallas, TX 75236 * POTSnet (214)709-3389 USA * FAXnet (214)709-3387 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.fonts Path: cs.utk.edu!ornl!fnnews.fnal.gov!mp.cs.niu.edu!news.ecn.bgu.edu!wupost !howland.reston.ans.net!usc!elroy.jpl.nasa.gov!ames!pacbell.com!pacbell !boo!seer!ariel Summary: Hungarian alphabet is Latin alphabet Message-ID: <1993Apr22.153120.2440@seer.gentoo.com> Date: Thu, 22 Apr 1993 15:31:20 GMT References: <1993Apr21.150237.1930@wheaton.wheaton.edu> Organization: Brad Lanam, Walnut Creek, CA From: ariel@seer.gentoo.com (Cathy Hampton) Subject: Re: Hungarian Keyboard Layout The Hungarian language, or Magyar, uses the Latin alphabet. If no one here responds by tomorrow with the keyboard layout, I have it at home in one of language books, I think. (I lived in Vienna for quite a while and learned a little Hungarian.) 
Catherine Hampton
================================================================
Compuserve: 71601,3130        GEnie: ARIEL        GEnie: AMNESTY
Internet: ariel@seer.gentoo.com      Internet/IGC: cah@igc.apc.org
================================================================

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Article 8620 of comp.fonts:
Path: cs.utk.edu!ornl!fnnews.fnal.gov!lll-winken.llnl.gov!uwm.edu!wupost
      !howland.reston.ans.net!ira.uka.de!Germany.EU.net!news.netmbx.de
      !mailgzrz.TU-Berlin.DE!fub!spoolbag.in-berlin.de!rainbow.in-berlin.de
      !rainbow.in-berlin.de!not-for-mail
From: rj@rainbow.in-berlin.de (Robert Joop)
Newsgroups: comp.fonts
Subject: Re: Latin 1 and Latin 3?
Date: 24 Apr 1993 04:07:43 +0200
Lines: 68
Message-ID: <1ra7df$pg0@rainbow.in-berlin.de>
References: <1993Apr22.115504.17537@news.columbia.edu>
NNTP-Posting-Host: rainbow.in-berlin.de
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

pcj1@cunixf.cc.columbia.edu (Pierre Jelenc) writes:

>I am looking for the assignments of characters to bytes in the Latin 1
>and Latin 3 character sets. In particular, I am concerned with the
>discrepancies between the tables found in DOS and windows manuals and the
>actual Latin 1 character set, and with the differences between Latin 1 and
>Latin 3.

from rfc1345 (Character Mnemonics & Character Sets):

[...]

  &charset ISO_8859-1:1987
  &rem source: ECMA registry
  &alias iso-ir-100
  &g1esc x2d41 &g2esc x2e41 &g3esc x2f41
  &alias ISO_8859-1
  &alias ISO-8859-1
  &alias latin1
  &alias l1
  &alias IBM819
  &alias CP819
  &code 0
  NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
  DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
  SP !  "  Nb DO %  &  '  (  )  *  +  ,  -  .  /
  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
  At A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
  P  Q  R  S  T  U  V  W  X  Y  Z  <( // )> '> _
  '! a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
  p  q  r  s  t  u  v  w  x  y  z  (! !! !) '? DT
  PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3
  DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC
  NS !I Ct Pd Cu Ye BB SE ': Co -a << NO -- Rg '-
  DG +- 2S 3S '' My PI .M ', 1S -o >> 14 12 34 ?I
  A! A' A> A? A: AA AE C, E! E' E> E: I! I' I> I:
  D- N? O! O' O> O? O: *X O/ U! U' U> U: Y' TH ss
  a! a' a> a? a: aa ae c, e! e' e> e: i! i' i> i:
  d- n? o! o' o> o? o: -: o/ u! u' u> u: y' th y:

[...]

  &charset ISO_8859-3:1988
  &rem source: ECMA registry
  &alias iso-ir-109
  &g1esc x2d43 &g2esc x2e43 &g3esc x2f43
  &alias ISO_8859-3
  &alias ISO-8859-3
  &alias latin3
  &alias l3
  &code 0
  NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
  DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
  SP !  "  Nb DO %  &  '  (  )  *  +  ,  -  .  /
  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
  At A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
  P  Q  R  S  T  U  V  W  X  Y  Z  <( // )> '> _
  '! a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
  p  q  r  s  t  u  v  w  x  y  z  (! !! !) '? DT
  PA HO BH NH IN NL SA ES HS HJ VS PD PU RI S2 S3
  DC P1 P2 TS CC MW SG EG SS GC SC CI ST OC PM AC
  NS H/ '( Pd Cu ?? H> SE ': I. S, G( J> -- ?? Z.
  DG h/ 2S 3S '' My h> .M ', i. s, g( j> 12 ?? z.
  A! A' A> ?? A: C. C> C, E! E' E> E: I! I' I> I:
  ?? N? O! O' O> G. O: *X G> U! U' U> U: U( S> ss
  a! a' a> ?? a: c. c> c, e! e' e> e: i! i' i> i:
  ?? n? o! o' o> g. o: -: g> u! u' u> u: u( s> '.

[...]

the mnemonics are explained in the rfc.
rfc's can be found on many ftp sites.
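[ARCHIVER'S NOTE: the differences between the two tables can also be listed mechanically. The following is a small sketch that leans on the ISO 8859-1 and ISO 8859-3 codecs shipped with Python, not on the RFC 1345 tables themselves; the function name is made up. Bytes that a codec leaves unassigned (the "??" positions above) are reported as None.]

```python
def differing_positions(enc_a="iso8859_1", enc_b="iso8859_3"):
    """Return {byte: (char_a, char_b)} for bytes where the charsets differ.

    None stands for a byte that the codec leaves unassigned.
    """
    def decode(byte, enc):
        try:
            return bytes([byte]).decode(enc)
        except UnicodeDecodeError:
            return None

    diffs = {}
    for byte in range(256):
        a, b = decode(byte, enc_a), decode(byte, enc_b)
        if a != b:
            diffs[byte] = (a, b)
    return diffs


diffs = differing_positions()
for byte, (a, b) in sorted(diffs.items()):
    print(f"0x{byte:02X}  Latin-1 {a!r:8}  Latin-3 {b!r}")
```

Everything below 0xA0 is identical in the two sets, so the listing only shows positions in the upper right-hand quarter of the code tables.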
rj -- __________________________________________________ Robert Joop rj@{rainbow.in-berlin,fokus.gmd,cs.tu-berlin}.de s=joop;ou=fokus;ou=berlin;p=gmd;a=dbp;c=de =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: bit.listserv.win3-l Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU Path: cs.utk.edu!darwin.sura.net!newsserver.jvnc.net!news.cac.psu.edu!psuvm !auvm!LUCS-01.NOVELL.LEEDS.AC.UK!ECL6TAM Return-Path: <@AUVM.AMERICAN.EDU,@VTBIT.BITNET:WIN3-L@UICVM.BITNET> Via: UK.AC.LEEDS.GPS; 2 JUL 93 8:56:26 BST Message-ID: Date: Fri, 2 Jul 1993 08:54:23 GMT Reply-To: T.A.McAllister@mailer.leeds.ac.uk Sender: Microsoft Windows Version 3 Forum From: Alec McAllister Subject: Re: Foreign language keyboards (German) Apologies if you already seen this. It was returned, implying that it had never reached the list. >Date: Thu, 1 Jul 1993 16:52:25 GMT >From: Alec McAllister >Subject: Re: Foreign language keyboards (German) > >>Date: Thu, 1 Jul 1993 10:16:50 -0500 >>From: Brian Madsen >>Subject: Foreign language keyboards (German) >> >>I occasionally use Windows for writing in German, and when I do, I switch >>the keyboard definition from US to German. This makes it lots easier to >>get at German foreign language characters (double ss's, umlauts, etc.) >> > >There's a better way. There's a piece of shareware called WinGreek. >That includes a program called Beta which "watches" your keyboard and >substitutes accented characters if you type certain combinations of >keys, e.g. if you type u followed by the plus-key on the numeric >keypad, Beta substitutes ANSI character 0252, u-umlaut. Similarly, >typing A followed by the plus-key makes Beta substitute ANSI 0196, A- >umlaut. The accents used in French, Spanish etc are just as quick and >easy to obtain. > >WinGreek and Beta work with any Windows product, not just word >processors. 
> >Beta plus a single font, the Times New Roman that comes with Windows, >can produce text in every major European language except Welsh (there >are no w-circumflex or y-circumflex characters). > >The beauty of this system is that you only have to learn one set of >special keys for all the languages: >/ = acute, >* = grave, >- = circumflex, >+ = umlaut, >tilde = tilde (Hurray!) and >the vertical gapped line = everything else (e.g. s followed by that >character gives you the German SZ that looks like a capital B, but A >followed by that character gives you the A with a ring above it which >is used in Scandinavian languages). > >WinGreek also gives you a superb Greek font with all the accents and >breathing-marks, a Hebrew font with (limited) right-to-left >processing, and even a font for Coptic. > >WinGreek is on archives such as CICA, but the authors are on email. I >can send their address if anyone is interested. > >. Alec McAllister Arts Computing Development Officer Computing Service University of Leeds LS2 9JT tel 0532 335399 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Path: cs.utk.edu!gatech!howland.reston.ans.net!Germany.EU.net!news.dfn.de !news.rwth-aachen.de!urmel.informatik.rwth-aachen.de!fangorn!michael Date: Tue, 10 Jan 95 19:06:02 MET Organization: An old and gray machine, somewhere in Moria. Message-ID: <9501103491@fangorn> References: <3ekfe6$3ed@news1.shell> NNTP-Posting-Host: akela.informatik.rwth-aachen.de From: Michael Haardt Subject: Re: What is a lantern symbol... 
kshaw@shell.portal.com (kendall thomason shaw) writes: > My question > is what similar symbols might there be in McDOS code pages 850 or 437 > for the following symbols: > > lantern symbol > checker board (stipple) > board of squares > scan line 1 > scan line 9 > plus I don't know about DOS, but the characters look as following: checker board: # # # # # # # # # # # # # # # # # # # # # # # scan line 1 is a horizontal line at the top of a character, scan line 9 is a horizontal line at the bottom of a character. A vt100 has various such horizontal lines. plus is indeed a big cross, like used in conjuction with the corner and line symbols. lantern and board of squares I can not tell you right now, my vt100 is at home. It may be that it does not have them, at least the wyse 60 I am using does not have those in its emulation. The mapping characters are very closely connected to the vt100 and the AT&T4410. > And then I am still (of course?) baffled by the acsc/ac capability > syntax. Am I to put an octal escape for the literal character there? > (after the corresponding character expected, e.g. \305 for center line > drawing criss-cross type symbol? Yes, indeed you can do it that way. I used it a few years ago with Minix. ac=n\305 would map n to such a cross for native PC fonts. Michael -- Twiggs and root are a wonderful tree (tm) Twiggs & root 1992 :-) d? H- s(+)/(-) g! au a- w v(---) C++(+++) UL++++S++++?++++ L++ 3 E- N+++ tv b+ e+ h f+ m@ r++ n@ y+ =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.unix.programmer,comp.terminals Path: cs.utk.edu!gatech!howland.reston.ans.net!pipex!sunic!news.funet.fi !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta Followup-To: comp.terminals Date: 16 Jan 1995 13:20:24 GMT Organization: Finnish Meteorological Institute (FMI) Lines: 36 Message-ID: <3fdrqo$ca4@kronos.fmi.fi> References: NNTP-Posting-Host: dionysos.fmi.fi In-Reply-To: Article of Ryan Groth From: hurtta@dionysos.fmi.fi (Kari E. 
Hurtta)
Subject: Re: ASCII CODES > 127 under VT100/ANSI & CURSES

[ Followups to comp.terminals ]

snakec@larry.wyvern.com (Ryan Groth) writes in comp.unix.programmer:
|
|I am writing a few applications under SCO unix (AT&T System V, POSIX...)
|using curses. I would like to use line drawing characters in the application
|which I am positive my terminal supports. I do not want to use the box()
|function however. If I addstr() with line drawing characters in the string I
|get M's and D's on the screen. Box does draw lines. Is there a way to use
|addstr() and send line characters? I am positive that my application will

These line-drawing characters come from a different character set.
The usual assignment (with curses and a VT100) may be:

   Bank G0    US-ASCII            Assigned with ESC ( B
   Bank G1    Special Graphics    Assigned with ESC ) 0

   Selecting bank G0 for characters 32-127 with SI
   Selecting bank G1 for characters 32-127 with SO

   ESC is 0x1B or Ctrl-[
   SI  is 0x0F or Ctrl-O
   SO  is 0x0E or Ctrl-N

As you can see, drawing line characters is not so simple (the VT100
does NOT use characters > 127; it does not support them). You can't do
it with addstr() alone, because the task also includes the
character-set assignments.

(If the terminal supports 8-bit characters, you can perhaps assign
Special Graphics to bank G1 and select bank G1 for characters 128-255
with ESC ~
I am not sure, however, that the Special Graphics characters are
duplicated in the upper range; perhaps they are.)

--
- Kari E. Hurtta / Elämä on monimutkaista
Kari.Hurtta@Fmi.FI                  puh. (90) 1929 658
{hurtta,root,Postmaster}@dionysos.fmi.fi

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.lang.cobol, comp.terminals, comp.unix.aix, comp.periphs
Date: 5 Sep 1996 14:46:07 -0400
From: Richard Shuford
Subject: Re: terminfo files for AIX 2.3

In article <322EB2E4.594F@lincsys.com>, Jim Egerton writes:
>
> Anyone have terminfo files (xterm, vt100, aixterm) that work
> with the Microfocus toolbox on AIX?
> > Using the files shipped with Microfocus V.3.2.37 I have tried > using an aixterm as well an xterm with TERM set to xterm and > vt100. With the aixterm or xterm and TERM=xterm, the video > didn't work properly (line's were displayed as qqqqqq). The display of a row of "qqqqqqq..." is a symptom of the client application wanting to use the DEC Line-Drawing Character Set, which is built into VT100s, VT320s, and any other DEC-like terminal built since 1980. With the proper character set mapped into the "alternate" character set, and if the terminal (or emulation) properly honors codeset switching, a horizontal line is displayed, instead of "qqqqqqqq...". (By the way, this is *not* the same as DEC's "advanced video option", or AVO. AVO on a VT100 gave you 24-line-by-132-column mode and the full four video attributes: underline, reverse, bold, & blink. Later DEC terminals had support for this as standard.) > With an xterm and TERM=vt100, the video is great (appears to use > the vt100 graphics character set to draw frames), but the > function keys didn't work. You don't say what kind of keyboard you are using. Makes a difference. > After copying the terminfo files to a local directory, > pointing COBTERMINFO and TERMINFO at the local directory, and > running the .src files through tic, the situation improved > slightly. The video for the aixterm and xterm with > TERM=xterm is better, but frames are drawn using +---+ > instead of the vt100 graphics characters. A reasonable thing to do, if the client cannot be certain that your xterm emulation supports the line-drawing characters. > I was able to modify the kf1 settings in vt100.src so that the function > keys are recognized, but the frames are drawn the same as > with the aixterm and xterm with TERM=xterm. > > I also pulled the example vt100 file from the Microfocus > Cobol home page and tried using this with an xterm. Same > results--no advanced video. 
> > If anyone has any terminfo files that appear to work in this > environment, or online documentation for the settings of sgr, > sgr0, enacs, rmacs, and acsc I'd really appreciate it. The global master database for terminfo and termcap descriptions is now maintained by Eric S. Raymond and is available from: http://www.ccil.org/~esr/ncurses.html ........................................ Addendum: the master terminfo/termcap files contain a "klone+acs" entry that tries to use the line-drawing characters from the IBM PC alternate character set. This might work with any Intel console. ........................................ =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!news-res.gsl.net !news.gsl.net!news.mathworks.com!newsfeed.internetmci.com!demos !news.uni-stuttgart.de!uniol!uni-erlangen.de!lrz-muenchen.de !news.rz.uni-passau.de! Message-ID: <32102731.87@fmi.uni-passau.de> Organization: University of Passau, Germany Date: Tue, 13 Aug 1996 08:56:49 +0200 From: Martin Ramsch To: Mike Ching X-Mailer: Mozilla 3.0b5 (X11; I; SunOS 5.5 sun4u) References: NNTP-Posting-Host: 132.231.20.18 Lines: 35 Subject: Re: I want lines, not q's! Mike Ching wrote: > > I'm trying to write a VT-100/ANSI terminal emulator in QuickBasic, but > I'm getting a bunch of > > qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq > > where there are supposed to be horizontal lines. How am I supposed to > recognize when there are supposed to be lines instead of q's? I've > noticed even some commercial programs with the same problem. I don't know exactly about VT-100/ANSI, but xterm's behaviour should be quite similiar (BTW, what are the differences?). 
What you observe is the switching between charsets:

  Control-N (SO, Shift Out):
     Switch to Alternate Character Set: invokes the G1 character set
  Control-O (SI, Shift In):
     Switch to Standard Character Set: invokes the G0 character set
     (the default)

Which character sets G0 and G1 actually refer to is controlled by

  ESC ( : Designate G0 Character Set
     ESC ( B = United States (USASCII)
     ESC ( 0 = DEC Special Character and Line Drawing Set
  ESC ) : Designate G1 Character Set
     ESC ) B = United States (USASCII)
     ESC ) 0 = DEC Special Character and Line Drawing Set

I guess that by default G0 should refer to USASCII and G1 to the Line
Drawing Set.

So, in a nutshell, you have to pay attention to these code sequences!

See and

--
Sincerely/Mit freundlichen Gruessen
Martin Ramsch
Inbox/Fax: 02561/91371-6364

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.os.linux.development,comp.terminals
Followup-To: comp.terminals
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!swrinde!pipex!sunic
      !sunic.sunet.se!news.funet.fi!news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta
Date: 7 Apr 1995 07:21:53 GMT
Organization: Finnish Meteorological Institute (FMI)
Message-ID: <3m2p6h$kll@kronos.fmi.fi>
In-Reply-To: Article <3bjdl0$lfd@nyx10.cs.du.edu> of Colin Plumb
References: <784.2EDBB0B0@purplet.demon.co.uk> <3bi0he$c6v@trane.uninett.no>
            <3bi58q$8fv@kronos.fmi.fi> <3bjdl0$lfd@nyx10.cs.du.edu>
From: hurtta@dionysos.fmi.fi (Kari E. Hurtta)
Subject: 8-bit charset in C1-C3 banks (Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m - was: Re: using the V ...)

[ This is a comment on a very old article from my archive :-) ]

colin@nyx10.cs.du.edu (Colin Plumb) writes in comp.terminals:
|I just went through RFC 1345 and the CCITT Red Book Recommendation T.51.
|It seems that the standard escape sequence looks like:
|CSI P P P ... P I...I F
|Where P are "parameters" taken from the 0x30..0x3F range (0123456789:;<=>?)
|I are magic modifier flags that can totally change the meaning of the escape
|sequence, taken from 0x20..0x2F ( !"#$%&'()*+,-./)
|And F is a final letter from 0x40..0x7E (@A..Z[\]^_`a..z{|}~) which specifies
|what the escape sequence is all about.
|
|The parameters P are decimal numbers separated by semicolons in the usual
|way. An all-zero field is synonymous with an empty field. Trailing empty
|fields and the separating semicolons can be stripped. Using a colon (:)
|is reserved for future standardization. If the parameters start with any
|of 0x3C..0x3F (<=>?), it's private-use.
|
|The top bit is ignored if set, although it's not supposed to be, in all
|the arguments.
|
|(That is taken from ISO 6429. It also says that F in the range of 0x70..0x7E
|is not to be standardized, but is for experimental use.)
|
|This applies to CSI, also known as ESC [. However, some of the ESC sequences
|described below also seem to use a similar pattern, although the last
|group of final characters isn't reserved and none of the sequences discussed
|here have parameters.
|
|As I understand it, you have two control sets available, C0 and C1.
|Characters from 0..0x1F are in C0, and 0x80..0x9F are in C1. In case you
|can't send 8-bit characters, ESC-@ through ESC-_ are synonyms for
|128 through 159. (ESC-x means x+64, for 64 <= x < 96.)
|
|You can select a C0 set with ESC ! F, where F is one of the final
|characters discussed above, and a C1 set with ESC " F.
|
|There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F).
|You can have 4 of these floating around, G0, G1, G2 and G3. The 0x20..0x7F
|and 0xA0..0xFF ranges are available to have these sets mapped into them.
|When you see a "0x3F", for example, you have to figure out which set (G0,
|G1, G2 or G3) is mapped into that space, and then figure out which character
|set is in force there.
|
|It's a bit like the 4 segment registers on the 8086.
|
|94-character sets are mapped in with ESC ( F, ESC ) F, ESC * F and ESC + F.
|These are the G0..G3 slots, respectively. There's also an overflow range
|which is used, ESC ( ! F, etc.
|
|96-character sets can only be mapped to the G1..G3 slots. That uses
|ESC - F, ESC . F and ESC / F. The "F" assignments are independent of
|the assignments for the 94-character sets.
|
|I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in
|0xA0..0xFF, but I'm not finding it documented.
|
|Anyway, you can then choose the mapping of bytes to graphic character
|sets. This is done with LS0, LS1, LS2 and LS3 (locking Shift N)
|to place G0..G3 in the 0x20..0x7F range, and LS1R, LS2R and LS3R for
|the 0xA0..0xFF range. There's also SS2 and SS3 to shift the next character
|from G2 or G3 into the 0x20..0x7F range.
|
|In the document I have, SS2 is 0x19 (EM) and SS3 is 0x1D (GS).
|LS0 is 0x0F (SI), and LS1 is 0x0E (SO). LS2 is ESC n and LS3 is
|ESC o. LS1R is ESC ~, LS2R is ESC } and LS3R is ESC |.
|
|There are also multi-byte character sets, using either 94 or 96
|characters, selected with ESC $ F, ESC $ ) F, ESC $ * F and ESC $ + F
|for the 94-character case, and ESC $ - F, ESC $ . F and ESC $ / F for
|the 96-character case.
|
|You can have "dynamically reconfigurable character sets" (downloadable fonts),
|which are specified by inserting a space (0x20) between the character-set
|specifier and the final character. (If 63 is not enough, overflow using
|the ! hack is a possibility.)
|
|Oh, and finally, you can replace everything (all 128 or 256 characters)
|with ESC % F. What happens after that depends on the new character set,
|which may or may not define ESC to get at the old things.
|
|Now, what I don't understand is how 8-bit character sets work. RFC 1345
|specifies rather a lot of them, and generally uses the 96-character escapes
|for them, but there are a few 94-character escapes specified.
|In particular, ESC ( t and ESC ( | specify the NAPLPS and T.101-G2
|character sets, which are 8 bits.
|I could reconcile this if the G sets had room for two banks of characters
|(low and high), and 7-bit sets loaded both identically, while 8-bit
|sets loaded them differently, and the various shift functions fetched
|from the corresponding bank. But I can't find it referred to anywhere.

It seems that 94-banks really hold only 94 characters and 96-banks
only 96 characters. With 8-bit characters, a bank covers characters
161-254 (94-bank) or 160-255 (96-bank).

So once the bank has been selected, the highest bit of the character
is ignored. The highest bit affects only the choice between GR and GL,
and GR/GL in turn determines which of the banks G0-G3 is used. After
that, the character is indexed from the bank as (char & 127) -- or
this is my impression from some documents (especially from
draft-ohta-text-encoding-01.txt).

Can you confirm this?

|Anyway, I don't think I've made any suggestions or asked any questions,
|but maybe this information dump will help some other people.
|--
| -Colin

[ CC'ed to colin@nyx10.cs.du.edu ]

--
- Kari E. Hurtta / Elämä on monimutkaista
  Kari.Hurtta@FMI.FI puh. (90) 1929 658
  {hurtta,root,Postmaster}@dionysos.FMI.FI

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Newsgroups: comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!news.alpha.net
      !news.mathworks.com!transfer.stratus.com!xylogics.com!Xylogics.COM!carlson
Date: 7 Apr 1995 12:08:07 GMT
Organization: Xylogics Incorporated
Message-ID: <3m39v7$2es@newhub.xylogics.com>
References: <784.2EDBB0B0@purplet.demon.co.uk> <3bi0he$c6v@trane.uninett.no>
            <3bi58q$8fv@kronos.fmi.fi> <3bjdl0$lfd@nyx10.cs.du.edu>
            <3m2p6h$kll@kronos.fmi.fi>
NNTP-Posting-Host: newhub.xylogics.com
From: carlson@Xylogics.COM (James Carlson)
Subject: Re: 8-bit charset in C1-C3 banks (Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m - was: Re: using the V ...)

In article <3m2p6h$kll@kronos.fmi.fi>, hurtta@dionysos.fmi.fi (Kari E. Hurtta) writes:
[...]
|> |You can select a C0 set with ESC ! F, where F is one of the final
|> |characters discussed above, and a C1 set with ESC " F.

Do you have a reference for that? I've never seen those described
or used. (I'm not even sure what it would mean to have a "C0 set" ...)

|> |There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F).
|> |You can have 4 of these floating around, G0, G1, G2 and G3. The 0x20..0x7F
|> |and 0xA0..0xFF ranges are available to have these sets mapped into them.
|> |When you see a "0x3F", for example, you have to figure out which set (G0,
|> |G1, G2 or G3) is mapped into that space, and then figure out which character
|> |set is in force there.

You left out GL and GR. GL (Graphics Left) is the pointer which maps
the 20-7E characters into one of the Gx sets. Thus, GL has one of the
values 0, 1, 2 or 3. GR (Graphics Right) is the pointer for the A0-FF
set. This is usually restricted to 1, 2 or 3 (not 0).

|> |I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in
|> |0xA0..0xFF, but I'm not finding it documented.

The default (at least for VT-series terminals) is GL=0, GR=2, G0=ascii,
G1=ascii, G2=multinational and G3=multinational.

|> |Anyway, you can then choose the mapping of bytes to graphic character
|> |sets. This is done with LS0, LS1, LS2 and LS3 (locking Shift N)
|> |to place G0..G3 in the 0x20..0x7F range, and LS1R, LS2R and LS3R for
|> |the 0xA0..0xFF range. There's also SS2 and SS3 to shift the next character
|> |from G2 or G3 into the 0x20..0x7F range.

Actually, the locking-shift operators just change the GL and GR
pointers.

---
James Carlson                            Tel: +1 617 272 8140
Annex Software Support / Xylogics, Inc.
+1 800 225 3317
53 Third Avenue / Burlington MA 01803-4491    Fax: +1 617 272 2618

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Article 3934 of comp.terminals:
Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!atglab.bls.com!gatech!newsjunkie.ans.net!newstf01.news.aol.com!newsbf02.news.aol.com!not-for-mail
From: psichel@aol.com (PSichel)
Newsgroups: comp.terminals
Subject: Re: 8-bit charset in C1-C3 banks (Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m - was: Re: using the V ...)
Date: 13 Apr 1995 11:07:03 -0400
Organization: America Online, Inc. (1-800-827-6364)
Lines: 23
Sender: root@newsbf02.news.aol.com
Message-ID: <3mjemn$5j5@newsbf02.news.aol.com>
References: <3m2p6h$kll@kronos.fmi.fi>
Reply-To: psichel@aol.com (PSichel)
NNTP-Posting-Host: newsbf02.mail.aol.com

In Message-ID: <3m2p6h$kll@kronos.fmi.fi> you wrote:

>Now, what I don't understand is how 8-bit character sets work.

8-bit character sets that follow the ISO structure (ISO 2022) are made
up of two 7-bit "halves". For example, ASCII in GL and ISO Latin-1
Supplemental in GR. The combined 8-bit set is called "ISO Latin
Alphabet Nr 1" or "ISO Latin-1" for short.
[Ignoring the control sets C0 & C1 for simplicity]

ISO 8859/1 (Latin-1) through ISO 8859/9 define additional 8-bit sets
by specifying the supplemental part to be used in GR along with ASCII
in GL.

IBM Code Pages are different in that they have no structure for
designating and invoking (switching) character sets or components.
Each code page defines a fixed, application-specific repertoire.
The term "code page" refers to the page number on which the character
set is described in IBM's master book of character encodings.
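
The GL/GR model discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anyone's actual terminal code: it starts from the VT-series defaults James Carlson lists (GL=0, GR=2, G0 and G1 = ASCII), substitutes the ISO Latin-1 supplemental set for DEC Multinational in G2/G3, carries only a small subset of the DEC Special Graphics table, and handles only SI, SO, ESC ( F and ESC ) F. The names Iso2022Decoder, ascii_set, latin1_supplement and dec_special_graphics are made up for the example.

```python
SI, SO, ESC = 0x0F, 0x0E, 0x1B

# Partial DEC Special Graphics table (illustrative subset only).
DEC_GRAPHICS = {"j": "┘", "k": "┐", "l": "┌", "m": "└", "q": "─", "x": "│"}


def ascii_set(code7: int) -> str:
    return chr(code7)


def latin1_supplement(code7: int) -> str:
    # The right half of Latin-1 is the 7-bit index with the top bit restored.
    return chr(code7 | 0x80)


def dec_special_graphics(code7: int) -> str:
    ch = chr(code7)
    return DEC_GRAPHICS.get(ch, ch)


# Final characters understood after ESC ( or ESC ) in this sketch.
DESIGNATORS = {ord("B"): ascii_set, ord("0"): dec_special_graphics}


class Iso2022Decoder:
    """Hypothetical helper: tracks G0..G3 plus the GL and GR pointers."""

    def __init__(self) -> None:
        # Assumed defaults: G0=G1=ASCII, G2=G3=Latin-1 supplement, GL=0, GR=2.
        self.g = [ascii_set, ascii_set, latin1_supplement, latin1_supplement]
        self.gl = 0
        self.gr = 2

    def decode(self, data: bytes) -> str:
        out = []
        i = 0
        while i < len(data):
            b = data[i]
            if b == SO:                      # LS1: GL now points at G1
                self.gl = 1
            elif b == SI:                    # LS0: GL now points at G0
                self.gl = 0
            elif (b == ESC and i + 2 < len(data)
                  and data[i + 1] in b"()" and data[i + 2] in DESIGNATORS):
                slot = 0 if data[i + 1] == ord("(") else 1
                self.g[slot] = DESIGNATORS[data[i + 2]]
                i += 2                       # skip intermediate and final bytes
            elif 0x20 <= b <= 0x7E:          # left half: look up through GL
                out.append(self.g[self.gl](b))
            elif 0xA0 <= b <= 0xFF:          # right half: top bit only selects GR
                out.append(self.g[self.gr](b & 0x7F))
            # Anything else (other controls, unknown escapes) is dropped here.
            i += 1
        return "".join(out)
```

With these defaults, `Iso2022Decoder().decode(b"\x0eqqq\x0f")` gives back plain `qqq`, because G1 still holds ASCII; sending `ESC ) 0` first makes the same bytes come out as a horizontal line.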
- Peter =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Article 2632 of comp.protocols.kermit.misc: Newsgroups: comp.protocols.kermit.misc Path: cs.utk.edu!cssun.mathcs.emory.edu!hobbes.cc.uga.edu!news-feed-1.peachnet.edu!news.netins.net!newshost.marcam.com!uunet!psinntp!nntp.hk.super.net!news.ust.hk!apang From: apang@cs.ust.hk (Albert PANG) Subject: How To read/write Chinese at a remote host using UNIX C-Kermit Message-ID: <1995Apr24.142214.28377@uxmail.ust.hk> Sender: usenet@uxmail.ust.hk Nntp-Posting-Host: cssu81.cs.ust.hk Organization: The Hong Kong University of Science and Technology X-Newsreader: TIN [version 1.2 PL2] Date: Mon, 24 Apr 1995 14:22:14 GMT Lines: 84 How to read/write Chinese at a remote host using UNIX C-Kermit ============================================================== Software required: ----------------- 1) cxterm 2) kermit 'cxterm' is available at anonymous ftp ftp://cs.purdue.edu:/pub/ygz/cxterm-??.??.??.tar.Z Linuxers can also get a binary version on Linux at ftp://sunsite.unc.edu:/pub/Linux/X11/xutils/terms/cxterm-??.tar.gz C-Kermit 5A for your version of UNIX is available from ftp://kermit.columbia.edu/kermit/archives/cku190.tar.{Z,gz} Setup procedure: ---------------- 1. Make sure you have cxterm properly installed and can display/write Chinese characters in your local host. To get cxterm properly installed, the FAQ for cxterm, which is available at anonymous ftp: cs.purdue.edu:/pub/ygz/CXTERM.FAQ will be helpful. There are currently a few encoding methods for Chinese characters. They are Big5, GB and HZ. In HK and Taiwan, Big5 is more popular and in Mainland China, GB and HZ are more popular. 'cxterm' can be configured to support all of them. Anyway, this will not be relevant to kermit, as long as they are 8-bit code. 'cxterm' configured to a particular encoding will recognize that encoding only. 2. Open a cxterm and run kermit. 3. Configure kermit. 
Before you connect your modem to kermit, you need some parameter settings: set parity none set command bytesize 8 set terminal bytesize 8 set terminal character-set transparent Then connect as usual and log in to your remote host. 4. At your remote host, set the terminal to allow 8-bit character by UNIX-Prompt> stty pass8 This example works on SunOS, but the syntax might differ for other UNIX systems, for example "stty cs8" or "stty -parity". On non-UNIX systems use the appropriate command (like "set terminal /eightbit" on VMS). If you don't do this, you can still read Chinese, but you can't type, since your terminal will truncate the highest bit of your code. (unless of course, your terminal has already been configured) You might like to include the above line in your shell rc script, so that you won't have to type it in every time you log in. 5. Voila! You should now be able to read/write Chinese in your cxterm. Go get a cup of tea or something and try read some Chinese newsgroups. alt.chinese.txt.big5 alt.chinese.txt tw.bbs.talk.joke Make sure you have the right kind of cxterm. cxterm configured to read Big5 will not recognize a passage written in GB, and vice versa. And for information about how to read/write Chinese using MS-DOS Kermit, see "Circumnavigating the Web" in Kermit News #6: ftp://kermit.columbia.edu/kermit/e/newsn6.{txt,ps} http://www.columbia.edu/kermit/newsn6.html -- Albert Pang =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Kterm Announcement Sat May 4 14:11:37 1991 Internet: mleisher@nmsu.edu Bitnet : mleisher@nmsu.bitnet Mark Leisher Computing Research Lab New Mexico State University Box 3CRL Las Cruces, NM 88001-0001 +1 505 646-5711 INTRODUCTION ------------ Kterm is a modified version of xterm that is capable of displaying text from character sets requiring 2-bytes per character as well as the standard single byte character sets. The original kterm was designed to support display of Japanese text. 
This capability has been expanded to include Chinese and Korean as well.

CHARACTER SETS AND CODINGS
--------------------------

Version 4.1.2 of kterm can display Chinese, Japanese, and Korean text
in a number of coding systems. With the exception of the Korean
N-byte coding, all of the coding systems described below require two
bytes per character.

1. Chinese

   A. GB2312-1980 (GuoBiao) PRC standard

      GB is a seven bit standard that requires two bytes per
      character. It is most often used with the high (most
      significant) bit set on each byte of the character to
      distinguish the Chinese text from other seven bit text. The
      eight bit usage of GB is also used in CCDOS, the Chinese
      version of MS-DOS.

      NOTE: Perhaps the eight bit usage should be referred to as EUC
            (Extended Unix Code).

      CODE RANGE: 0xA1A1-0xFEFE

   B. Shift-GB

      Shift-GB is a mixed seven and eight bit coding, with the first
      byte always having the high (most significant) bit set to
      distinguish it from other seven bit text. Shift-GB was used by
      the Chinese Macintosh OS until recently.

      NOTE: I'm not sure if it is an official standard.

      CODE RANGE: 0x8140-0xAFFC (excluding 0x7F as a second byte)

   C. Big5

      Big5 is a mixed seven and eight bit coding, with the first byte
      always having the high (most significant) bit set to
      distinguish it from the other seven bit text. Big5 is at least
      a de facto standard in places like Hong Kong and Taiwan where
      the Traditional Chinese ideographs are used.

      NOTE: Rumor has it that it is, or will be, a standard in
            Taiwan. I don't have any facts on this yet.

      CODE RANGE: 0xA140-0xF9FE

2. Japanese

   A. JIS (Japanese Industrial Standard X0208-1983)

      JIS is a seven bit standard that is usually distinguished from
      other seven bit text by a starting and ending escape sequence.

      START ESCAPE SEQUENCE: ESC $ B (NEW-JIS)   ESC $ @ (OLD-JIS)
      END ESCAPE SEQUENCE  : ESC ( B

      CODE RANGE: 0x2121-0x7E7E

   B. Shift-JIS

      Shift-JIS is a mixed seven and eight bit coding, with the high
      (most significant) bit of the first byte set to distinguish it
      from the other seven bit text.

      CODE RANGE: FIRST BYTE : 0x81-0x9F and 0xE0-0xEF
                  SECOND BYTE: 0x40-0xFC (excluding 0x7F)

   C. EUC

      EUC is an eight bit usage of JIS, with the high (most
      significant) bit of each byte set to distinguish it from other
      seven bit text.

      CODE RANGE: 0xA1A1-0xFEFE

3. Korean

   A. KSC5601-1987 (Jamos and Hangul)

      This version of kterm only supports the Jamos (Hangul elements)
      and Hangul portion of the KSC5601-1987 standard. The Hanja
      portion will come later. KS is a seven bit standard that
      requires two bytes per Hangul character. It is most often used
      with the high (most significant) bit set on each byte of the
      character to distinguish the Korean text from other seven bit
      text.

      NOTE: Perhaps the eight bit usage should be referred to as EUC
            (Extended Unix Code).

      CODE RANGE: JAMOS : 0xA4A1-0xA4FE
                  HANGUL: 0xB0A1-0xC8FE

   B. N-byte

      N-byte code is a way of representing Hangul text using only
      ASCII characters. It uses a variable number of bytes to select
      a particular Hangul syllable and is distinguished from other
      seven bit text by the SO (Shift Out) sequence and the SI (Shift
      In) sequence.

      START ESCAPE SEQUENCE: ^N (0x0E)
      END ESCAPE SEQUENCE  : ^O (0x0F)

      CODE RANGE: 0x41-0x7C (full range)

      NOTE: The code range actually varies. See the file "hgutil.c"
            for details.

4. X11 Compound Text

   Version 4.1.2 of kterm now recognizes most of the Compound Text
   approved standard encodings. It does not recognize the
   non-standard character set encodings or the directionality
   indicators. Even though the approved standard encodings are
   recognized, this is no guarantee that they will display text
   appropriately, specifically the right-to-left encodings. Code will
   have to be added to support this.
The 94^N Compound Text sequences for GB 2312-1980, JIS X0208-1983, and KS C5601-1987 will be interpreted correctly if the appropriate language is chosen when starting kterm, or if it is set in the application defaults file, KTerm.ad. FONTS ----- There are a number of freely available Chinese, Japanese and Korean X11 fonts available. Here are some anonymous ftp sites where the fonts are available: 1. HOST: crl.nmsu.edu [128.123.1.14] CRL has a relatively complete collection of the freely available Chinese, Japanese, and Korean X11 fonts. They are located in the subdirectories pub/chinese/fonts, pub/japanese/fonts, and pub/korean/. The CRL site also has lists of known anonymous ftp sites for software related to the language of interest. 2. HOST: miki.cs.titech.ac.jp [131.112.16.39] HOST: utsun.s.u-tokyo.ac.jp [133.11.11.11] These ftp sites have large collections of many Usenet and JUNET newsgroup archives. The fj.sources archives contain many of the Japanese X11 fonts that have been posted on JUNET. There are Index files in most of the directories describing which archive file has the font sources. 3. HOST: kum.kaist.ac.kr [137.68.1.65] There are a few Korean utilities available from this site as well as archives of a number of Usenet news groups. Most of the Korean related code and fonts are located in pub/hangul/. AUTHORS AND CONTRIBUTORS ------------------------ The initial conversion work on xterm for displaying Japanese text was done by kagotani@cs.titech.ac.jp (Hiroto Kagotani). The ANSI color support was added using the kterm 4.1.0 patches provided by mukawa@tn-sec.ntt.junet (Susumu Mukawa). The Multi-Byte Character Set Word Select feature was added using a modified version of Kiyoshi KANAZAWA's 4.1.0 MBCS_WSEL patches. The Chinese and Korean support was added by mleisher@nmsu.edu (Mark Leisher). 
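
As an aside on the codings listed under CHARACTER SETS AND CODINGS: the relationship between the seven-bit codes and their eight-bit (EUC-style) usage, and the Shift-JIS byte ranges, can be checked mechanically. The Python sketch below uses only the code ranges quoted in this announcement; the function names are hypothetical and are not part of kterm.

```python
from typing import Tuple


def jis_to_euc(b1: int, b2: int) -> Tuple[int, int]:
    """Set the high bit on both bytes of a 7-bit two-byte code (JIS, GB, KS)."""
    if not (0x21 <= b1 <= 0x7E and 0x21 <= b2 <= 0x7E):
        raise ValueError("not in the 0x2121-0x7E7E code range")
    return b1 | 0x80, b2 | 0x80


def euc_to_jis(b1: int, b2: int) -> Tuple[int, int]:
    """Clear the high bit on both bytes of an 8-bit (EUC-style) code."""
    if not (0xA1 <= b1 <= 0xFE and 0xA1 <= b2 <= 0xFE):
        raise ValueError("not in the 0xA1A1-0xFEFE code range")
    return b1 & 0x7F, b2 & 0x7F


def is_shift_jis_pair(b1: int, b2: int) -> bool:
    """Check a byte pair against the Shift-JIS ranges given above."""
    first_ok = 0x81 <= b1 <= 0x9F or 0xE0 <= b1 <= 0xEF
    second_ok = 0x40 <= b2 <= 0xFC and b2 != 0x7F
    return first_ok and second_ok
```

The two conversion helpers are deliberately symmetric: the eight-bit usage differs from the seven-bit code only in the most significant bit of each byte, which is why the ranges 0x2121-0x7E7E and 0xA1A1-0xFEFE line up.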
CLOSING NOTES ------------- The {character set,font set,language,conversion} mechanisms are a little clumsy and should eventually be modified to be more in line with XPG3 locale specifications and the up-coming X11 i18n specifications. Hopefully, this won't be too far away. BUG REPORTS ----------- Please send bug reports and/or fixes for kterm 4.1.2 to mleisher@nmsu.edu or mleisher@nmsu.bitnet. THANKS ------ I would like to express my thanks to Mr. Kagotani for doing the initial conversion work. His code made it a lot easier for me to add support for Chinese and Korean. Thanks go to Ricky Yeung and F. F. Lee for making their Chinese code conversion programs freely available. I would also like to thank ujsung@solgai.kaist.ac.kr (UnJae Sung) for having the patience to answer my questions about Korean coding. And last but not least, thanks go to these people for significant bug reports and fixes: John Melby of Fujitsu Martin C. Fong of Sybase Yang Zhiwei of the German National Research Center for Computer Science Alton Harkcom (for help updating the Japanese manual page) =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!vixen.cso.uiuc.edu !news-peer.sprintlink.net!news.sprintlink.net!Sprint!newsfeed.nacamar.de !wuff.mayn.de!wuff.franken.de!news-nue1.dfn.de!news-mue1.dfn.de !rzg.mpg.de!lrz-muenchen.de!not-for-mail Newsgroups: comp.unix.questions,comp.unix.admin,comp.windows.x, comp.std.internat,comp.software.international,at.general, soc.culture.german,soc.culture.french,soc.culture.belgium, soc.culture.quebec,soc.culture.nordic,soc.culture.spain, soc.culture.portuguese,soc.culture.latin-american, soc.culture.brazil,soc.culture.argentina,soc.culture.mexico, soc.culture.italian,soc.culture.colombia,soc.culture.venezuela, soc.culture.peru,soc.culture.chile,bit.listserv.catala Distribution: world References: Message-ID: <5r028v$4fn$1@sparcserver.lrz-muenchen.de> Organization: 
Leibniz-Rechenzentrum, Muenchen (Germany) Date: 21 Jul 1997 16:20:47 GMT From: Helmut.Richter@lrz-muenchen.de (Helmut Richter) Subject: Re: ISO 8859-1 National Character Set FAQ mike@vlsivie.tuwien.ac.at writes: >*****If you can confirm or deny this, please let me know.***** >Currently, each system vendor has his own set of locale names, which >makes portability a bit problematic. Supposedly there is some X/Open >document specifying a > _. >syntax for environment variables specifying a locale, but I'm unable >to confirm this. POSIX 1003.1 recommends (in the informative annex E.1.3) to use the following syntax of locale names: language_TERRITORY.Code, e.g.: de_AT.ISO8859-1 hu_HU.ISO8859-2 ja_JP.AJEC The funny thing is that they use a different syntax in the example in section B.8.1.2 (also an informative annex). ==== I think one should add some info on redefining a keyboard under X11 as to include additional characters. I have written a lengthy paper on the topic, albeit in German language (http://www.lrz-muenchen.de/services/software/x11/xmodmap/). I am ready to translate a part of it into English, but certainly not all of it. This is also interesting for emacs under X11: emacs does make a difference between a key combination like Meta-d and a key combination that has been redefined to mean a non-ASCII character (of course you must not use the Meta key, which is typically the same as the Alt key, as Mode_switch key). It is thus not necessary to quote such characters with Ctrl-Q to prevent them from being taken for emacs commands. 
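
Returning to the language_TERRITORY.Code syntax quoted earlier in this post: a locale name of that shape can be split with a short Python sketch. This is an illustration only; the pattern is an assumption that is looser than any formal definition, and parse_locale_name and LocaleName are hypothetical names, not part of POSIX or of any library.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: str
    codeset: str


# Assumed shape: letters, underscore, letters, dot, then the codeset name.
_LOCALE_RE = re.compile(r"^([A-Za-z]+)_([A-Za-z]+)\.(.+)$")


def parse_locale_name(name: str) -> Optional[LocaleName]:
    """Split 'language_TERRITORY.Code'; return None if the name does not fit."""
    m = _LOCALE_RE.match(name)
    return LocaleName(*m.groups()) if m else None
```

Names that do not follow the pattern, such as a bare "C", come back as None rather than raising an error, so the caller can fall back to vendor-specific handling.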
Helmut Richter =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.os.linux.development.apps, comp.os.linux.development.system,comp.terminals Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net !demon!doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta Organization: Finnish Meteorological Institute (FMI) Lines: 53 Message-ID: <3pn802$sc1@kronos.fmi.fi> References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu> In-Reply-To: Article <3p9gne$mu7@uahcs2.cs.uah.edu> of Chris Ford NNTP-Posting-Host: dionysos.fmi.fi Date: 21 May 1995 11:25:54 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Character set assigments (Re: How do I display IBM PC characters?) [ Added comp.os.linux.developments.system as receiver because terminal driver is part of kernel -- right? Added comp.terminals as receiver, because that is terminal (or terminal emulation) issue. ] cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.developments.apps: | |Peter Koenig (koenig@interaccess) wrote: |: I'm trying to figure out how to display IBM PC characters. I know it's |: possible, but doing a simple printf() with the value gets it masked to |: 7-bits, and when I tried ncurses, it put the wrong character up... Any |: pointers to more info on this? | Before you do your printf, print this: "\033(U" and it will switch |to the DOS character set. "\033(B" will switch back. Or vice versa. Just a comment (and some surprising notes :-)) These ESC ( U is quite odd code in standards view as far I understands. ESC ( assigns bank G0. And bank G0 is on accessible in range 128-255. That is GR (right side; characters (128)160-255) can newer point to to bank G0. Only to banks G1-G3. 
It should be more understandable if code is ESC - A Assing Latin/1 (area (128)160-255) to G1 ESC - U Assign DOS character set (area 128-255) to G1 But it isn't that way :-) And codes 'ESC ( U', 'ESC ) U', 'ESC * U' and 'ESC + U' have already another standard meaning (see later). (both ESC - and ESC ) assigns G1 -- charset names are different. Hmm. ESC - can assign areas 160-255 (32-127), ESC ( can assign area 161-254 (33-126) -- yeas these are very confusing.) By to way -- from where that ident "U" comes for DOS character set? Just curious. Oops. Letter "U" is reserved for Latin-greek-1 (iso-ir-27) according of RFC 1345 (that is informal RFC). RFC 1345 lists following codes: ESC ( U Assigns iso-ir-27 to G0 ESC ) U Assigns iso-ir-27 to G1 ESC * U Assigns iso-ir-27 to G2 ESC + U Assigns iso-ir-27 to G3 RFC 1345 don't list codes ESC - U Assign {something} (160-255 (32-127)) to G1 ESC . U Assign {something} (160-255 (32-127)) to G2 ESC / U Assign {something} (160-255 (32-127)) to G3 [ Hmm. Perhaps I comment some other issues later. ] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.os.linux.development.system, comp.os.linux.development,,comp.terminals Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net !demon!doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta Message-ID: <3pnf1l$nc@kronos.fmi.fi> In-Reply-To: Article <3pn802$sc1@kronos.fmi.fi> of "Kari E. Hurtta" References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu> <3pn802$sc1@kronos.fmi.fi> NNTP-Posting-Host: dionysos.fmi.fi Organization: Finnish Meteorological Institute (FMI) Date: 21 May 1995 13:26:13 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Re: Character set assigments (Re: How do I display IBM PC characters?) hurtta@dionysos.fmi.fi (Kari E. 
Hurtta) writes: [ I promised to followup myself :-) ] |[ Added comp.os.linux.developments.system as receiver because terminal driver | is part of kernel -- right? Added comp.terminals as receiver, because that | is terminal (or terminal emulation) issue. ] [ Dropped comp.os.linux.development.apps from receivers. Added comp.os.linux.development as receiver :-) ] |cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.developments.apps: || Before you do your printf, print this: "\033(U" and it will switch ||to the DOS character set. "\033(B" will switch back. Or vice versa. |These ESC ( U is guite odd code in standards view as |far I understands. ESC ( assigns bank G0. And bank G0 is on accessible |in range 128-255. That is GR (right side; characters (128)160-255) can newer |point to to bank G0. Only to banks G1-G3. |It should be more understandable if code is | ESC - A Assing Latin/1 (area (128)160-255) to G1 | ESC - U Assign DOS character set (area 128-255) to G1 Because you want keep DEC special graphics in G1 (which is default for VT100), and GR is bydefault pointed to bank G2. Better use following codes: ESC . A Assign Latin/1 (area 160-255) to G2 ESC . U Assign DOS character set (area 160-255(*)) to G2 (*) There is still problem that C1 (128-159) is for control codes. At least some versions of Linux terminal driver interpreter one of these: CSI (9/11 or 0x9b) -- (IMHO -- it should interpreter all codes in C1 range or nothing them -- current situation confusing. Notice specially cursor control codes: IND (8/4 or 0x84), RI (8/13 or 0x8d) and NEL (8/5 or 0x85).) |But it isn't that way :-) |And codes 'ESC ( U', 'ESC ) U', 'ESC * U' and 'ESC + U' have already another |standard meaning (see later). |(both ESC - and ESC ) assigns G1 -- charset names are different. | Hmm. ESC - can assign areas 160-255 (32-127), | ESC ( can assign area 161-254 (33-126) -- yeas these are very confusing.) 
<...> |RFC 1345 lists following codes: | ESC ( U Assigns iso-ir-27 to G0 | ESC ) U Assigns iso-ir-27 to G1 | ESC * U Assigns iso-ir-27 to G2 | ESC + U Assigns iso-ir-27 to G3 |RFC 1345 don't list codes | ESC - U Assign {something} (160-255 (32-127)) to G1 | ESC . U Assign {something} (160-255 (32-127)) to G2 | ESC / U Assign {something} (160-255 (32-127)) to G3 RFC 1345 lists MS-DOS character set (charset: IBM437), but don't give character set assigment codes for this. |[ Hmm. Perhaps I comment some other issues later. ] [ I still seems to be some issue not to be covered yet. :-) ] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.os.linux.development.system, comp.os.linux.development,comp.terminals Path: cs.utk.edu!cssun.mathcs.emory.edu!emory!gatech!news.sprintlink.net!demon !doc.news.pipex.net!pipex!sunic!sunic.sunet.se!news.funet.fi!news.csc.fi !kronos.fmi.fi!dionysos.fmi.fi!hurtta Message-ID: <3pp9va$8je@kronos.fmi.fi> References: <3ok74b$5en@nntp.interaccess.com> <3p9gne$mu7@uahcs2.cs.uah.edu> <3pn802$sc1@kronos.fmi.fi> <3pnf1l$nc@kronos.fmi.fi> In-Reply-To: Article <3pnf1l$nc@kronos.fmi.fi> of "Kari E. Hurtta" Organization: Finnish Meteorological Institute (FMI) Date: 22 May 1995 06:11:54 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Re: Character set assigments (Re: How do I display IBM PC characters?) hurtta@dionysos.fmi.fi (Kari E. Hurtta) writes: [ I'm still followuping myself :-) ] ||cford@laser01.cs.uah.edu (Chris Ford) writes in comp.os.linux.developments.apps: ||| Before you do your printf, print this: "\033(U" and it will switch |||to the DOS character set. "\033(B" will switch back. Or vice versa. 
<...>
||It should be more understandable if code is
||      ESC - A         Assing Latin/1 (area (128)160-255) to G1
||      ESC - U         Assign DOS character set (area 128-255) to G1

{ ESC - U is just my suggestion; only the prefix ESC - is standard }

|Because you want keep DEC special graphics in G1 (which is default for VT100),
|and GR is bydefault pointed to bank G2. Better use following codes:
|       ESC . A         Assign Latin/1 (area 160-255) to G2
|       ESC . U         Assign DOS character set (area 160-255(*)) to G2

{ ESC . U is just my suggestion; only the prefix ESC . is standard }

|(*) There is still problem that C1 (128-159) is for control codes.
|    At least some versions of Linux terminal driver interpreter one
|    of these: CSI (9/11 or 0x9b) -- (IMHO -- it should interpreter
|    all codes in C1 range or nothing them -- current situation confusing. Notice specially cursor control codes: IND (8/4 or 0x84),
|    RI (8/13 or 0x8d) and NEL (8/5 or 0x85).)

In the article "Re: DO use ESC [ 11 m (was: Don't use ESC 11 m"... in the groups
comp.os.linux.development and comp.terminals, Colin Plumb (on 30 Nov 1994
19:50:08 -0700) gave information which indicates that perhaps the correct
prefix is ESC %, which changes the whole set (all 128 or 256 characters).
So perhaps even better codes would be something like:

        ESC % A         Assigns Latin/1 to G2, enables C1 (128-159) as
                        control range, assigns US-ASCII to G0
        ESC % U         Assigns MS-DOS to range 32-255 (G0, G2 and C1),
                        disables C1 as control range

{ The previous codes are just my suggestions, not from any specification.
  Only the prefix ESC % can be taken from ISO 6429. }

Hmm. According to the same article, the prefix ESC ! can be used to assign
C0 (0-31) and the prefix ESC " can be used to assign C1 (128-159).

By the way, what were the codes to assign UTF-8 and UTF-1? Was it
ESC % {something}? I think I have heard that a code for UTF-1 has been
officially assigned.
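The assignment prefixes discussed in this thread can be sketched in a few
lines of Python. This is only an illustration, assuming the prefixes as they
are quoted in the posts of this archive: ESC ( ) * + assign a 94-character
set to G0..G3, and ESC - . / assign a 96-character set to G1..G3. The
function name designate is hypothetical, and the final characters in the
examples (B, 0, A) are the ones listed for the VT420 in these posts. The
suggested but non-standard finals such as U are deliberately left out.

```python
ESC = "\x1b"  # 0033 in octal

# Intermediate characters per bank, as quoted in the posts of this archive.
DESIGNATE_94 = {0: "(", 1: ")", 2: "*", 3: "+"}  # 94-character sets, G0..G3
DESIGNATE_96 = {1: "-", 2: ".", 3: "/"}          # 96-character sets, G1..G3 only


def designate(bank, final, size=94):
    """Return the escape sequence assigning the set named by `final` to bank G<bank>."""
    if size == 94:
        table = DESIGNATE_94
    elif size == 96:
        table = DESIGNATE_96
    else:
        raise ValueError("size must be 94 or 96")
    if bank not in table:
        raise ValueError("a %d-character set cannot be assigned to G%d" % (size, bank))
    return ESC + table[bank] + final


if __name__ == "__main__":
    # repr() shows the sequences instead of sending them to the terminal.
    print(repr(designate(0, "B")))      # US-ASCII to G0
    print(repr(designate(1, "0")))      # DEC special graphics to G1
    print(repr(designate(2, "A", 96)))  # Latin/1 (160-255) to G2
```

Keeping two separate tables makes the restriction visible: there is no
96-character prefix for G0, so asking for one raises an error instead of
silently producing a 94-character sequence.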
<...> ||RFC 1345 lists following codes: || ESC ( U Assigns iso-ir-27 to G0 || ESC ) U Assigns iso-ir-27 to G1 || ESC * U Assigns iso-ir-27 to G2 || ESC + U Assigns iso-ir-27 to G3 ||RFC 1345 don't list codes || ESC - U Assign {something} (160-255 (32-127)) to G1 || ESC . U Assign {something} (160-255 (32-127)) to G2 || ESC / U Assign {something} (160-255 (32-127)) to G3 |RFC 1345 lists MS-DOS character set (charset: IBM437), but don't give |character set assigment codes for this. ||[ Hmm. Perhaps I comment some other issues later. ] |[ I still seems to be some issue not to be covered yet. :-) ] [ Perhaps I not followup myself -- I think that is going to be monology :-) ] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.os.linux.development,comp.terminals Path: cs.utk.edu!gatech!swrinde!pipex!sunic!news.tele.fi!news.csc.fi !kronos.fmi.fi!dionysos.fmi.fi!hurtta Message-ID: <3bjv6b$mf4@kronos.fmi.fi> References: <784.2EDBB0B0@purplet.demon.co.uk> <3bi0he$c6v@trane.uninett.no> <3bi58q$8fv@kronos.fmi.fi> <3bjdl0$lfd@nyx10.cs.du.edu> Organization: Finnish Meteorological Institute (FMI) Date: 1 Dec 1994 07:49:31 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Re: DO use ESC [ 11 m (was: Don't use ESC [ 11 m was: Re: using the V colin@nyx10.cs.du.edu (Colin Plumb) writes: > It seems that the standard escape sequence looks like: > CSI P P P ... P I...I F > Where P are "parameters" taken from the 0x30..0x3F range (0123456789:;<=>?) > I are magic modifier flags that can totally change the meaning of the escape > sequence, taken from 0x20..0x2F ( !"#$%&'()*+,-./) > And F is a final letter from 0x40..0x7E (@A..Z[\]^_`a..z{|}_) which specifies > what the escape sequence is all about. Thanks. Yes. I was little careless. For character set changing DEC uses I modifiers and that F final letters. > There are 94-character sets (0x21..0x7E) and 96-character sets (0x20..0x7F). 
> You can have 4 of these floating around, G0, G1, G2 and G3. The 0x20..0x7F
> and 0xA0..0xFF ranges are available to have these sets mapped into them.
> When you see a "0x3F", for example, you have to figure out which set (G0,
> G1, G2 or G3) is mapped into that space, and then figure out which character
> set is in force there.

> It's a bit like the 4 segment registers on the 8086.

> 94-character sets are mapped in with ESC ( F, ESC ) F, ESC * F and ESC + F.
> These are the G0..G3 slots, respectively. There's also an overflow range
> which is used, ESC ( ! F, etc.

The 94-character sets seem to be (in VT420):

        B       US-ASCII
        %5      DEC Multinational

For the following character sets it is not mentioned whether they are
94- or 96-character sets -- I think that these are 94-character sets:

        0       DEC special graphics
        >       DEC Technical
        <       user-preferred supplemental (*)

And also the following national character sets (available only in
national mode):

        A       UK-ASCII (ISO United Kingdom)
        4       DEC Dutch
        5       DEC Finnish
        R       ISO French
        9       DEC French Canadian
        K       ISO German
        Y       ISO Italian
        6       DEC Norwegian/Danish
        '       ISO Norwegian/Danish
        %6      DEC Portuguese
        Z       ISO Spanish
        =       DEC Swiss

(*) DEC Multinational or ISO Latin/1 (selectable with DCS ... ST codes).

> 96-character sets can only be mapped to the G1..G3 slots. That uses
> ESC - F, ESC . F and ESC / F. The "F" assignments are independent of
> the assignments for the 94-character sets.

The 96-character sets seem to be (in VT420):

        A       ISO Latin/1

> I think the default startup is supposed to be G0 in 0x21..0x7E and G1 in
> 0xA0..0xFF, but I'm not finding it documented.

That is how VTxxx-series terminals do it.

> There are also multi-byte character sets, using either 94 or 96
> characters, selected with ESC $ F, ESC $ ) F, ESC $ * F and ESC $ + F
> for the 94-character case, and ESC $ - F, ESC $ . F and ESC $ / F for
> the 960-character case.

You mean: ... and ESC $ / F for the 96-character case.

> Now, what I don't understand is how 8-bit character sets work.
RFC 1345 > specifies rather a lot of them, and generally uses the 96-character escapes > for them, but there are a few 94-character escapes specified. > In particular, ESC ( t and ESC ( | specify the NAPLPS and T.101-G2 > character sets, which are 8 bits. > I could reconcile this if the G sets had room for two banks of characters > (low and high), and 7-bit sets loaded both identically, while 8-bit > sets loaded them differently, and the various shift functions fetched > from the corresponding bank. But I can't find it referred to anywhere. At least codes ESC ) < ESC * < ESC + < ESC ) %5 ESC * %5 ESC + %5 changes both low and high side of banks (I think that I don't have used other codes for selecting 8-bit character sets.) I don't have tried use high side of bank when to bank have assigned 7-bit character set. > Anyway, I don't think I've made any suggestions or asked any questions, > but maybe this information dump will help some other people. -- - Kari E. Hurtta / Elämä on monimutkaista Kari.Hurtta@Fmi.FI puh. (90) 1929 658 {hurtta,root,Postmaster}@dionysos.fmi.fi =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.unix.admin,comp.terminals Path: cs.utk.edu!gatech!purdue!lerc.nasa.gov!magnus.acs.ohio-state.edu !math.ohio-state.edu!cs.utexas.edu!convex!cnn.exu.ericsson.se !erinews.ericsson.se!sunic!sunic.sunet.se!news.funet.fi!news.csc.fi !kronos.fmi.fi!dionysos.fmi.fi!hurtta Message-ID: <3s15pj$4cs@kronos.fmi.fi> References: <3rli9f$3qd@linet02.li.net> NNTP-Posting-Host: dionysos.fmi.fi Organization: Finnish Meteorological Institute (FMI) Date: 18 Jun 1995 12:22:11 GMT From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Subject: Re: extended ascii characters [ Added comp.terminals as receiver. ] steven.g.johnson.1@gsfc.nasa.gov (steve johnson) writes in comp.unix.admin: |In article <3rli9f$3qd@linet02.li.net>, cagenjo@scls1 (Agenjo) wrote: |> |> I am part of a team setting up 54 libraries on an Internet system. 
I
|> designed a welcome screen (using a DOS text editor) that I hoped would
|> greet new users. I used some extended ascii characters to create a nice
|> graphic, but when our sysadmin loaded it in, the characters we see upon
|> login are not what I used - they have become numbers, etc.
|
| unfortunately, different systems map differently.
|
|> He doesn't
|> think there is a way for his UNIX SunOS to properly display my file.
|> Does anyone know of a way to do this?
| i'm no expert on this, but what you probably want is one of the isolatin
| (iso8859) character sets. ascii is a proper subset of iso8859-1.

[ My answer is partially terminal specific and partially uses the document
  "ISO International Register of Coded Character Sets To Be Used With
  Escape Sequences". Sorry. ]

For drawing boxes ('nice graphics') he probably wants to use a special
graphics set such as the one in the VT100, i.e.:

-- Assign special graphics set to bank G1
        ESC ) 0         ESC is 0033 in octal
-- Select bank G1 for characters 32-127
        SO              SO is 0016 in octal
-- For boxes you can now use the characters
        upper left corner:      0154 in octal, 0x6C in hex
        upper right corner:     0153 in octal, 0x6B in hex
        lower left corner:      0155 in octal, 0x6D in hex
        lower right corner:     0152 in octal, 0x6A in hex
        horizontal line:        0161 in octal, 0x71 in hex
                (characters 0157 - 0163 have horizontal lines)
        vertical line:          0170 in octal, 0x78 in hex
-- To return to US-ASCII, select bank G0 for characters 32-127
        SI              SI is 0017 in octal
   (This assumes that G0 holds US-ASCII; if it doesn't,
    you can assign it with
        ESC ( B         ESC is 0033 in octal)

That special graphics set is DEC-specific, but for example (in theory)
xterm also supports it.

To assign Latin/1 you need a VT300 or better:

-- First assign US-ASCII to bank G0
        ESC ( B         ESC is 0033 in octal
-- Select bank G0 for characters 32-127
        SI              SI is 0017 in octal
-- Assign Latin/1 range 160-255 to bank G2
        ESC . A         ESC is 0033 in octal
-- Select bank G2 for characters 160-255
        ESC }           ESC is 0033 in octal
 * Now you have Latin/1 available
 - If you have a shortage of banks and you don't want to use special
   graphics in bank G1, you can assign Latin/1 range 160-255 to bank G1
        ESC - A         ESC is 0033 in octal
   and select bank G1 for characters 160-255
        ESC ~           ESC is 0033 in octal

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.terminals
Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!mp.cs.niu.edu
      !vixen.cso.uiuc.edu!howland.reston.ans.net!spool.mu.edu
      !bloom-beacon.mit.edu!crl.dec.com!crl.dec.com!nntpd.lkg.dec.com
      !regent.enet.dec.com!lasko
From: lasko@regent.enet.dec.com
X-From: (Tim Lasko, Digital Equipment Corp., Marlborough, MA)
Subject: Re: Hebrew keyboard mapping
Date: 6 JUL 95 13:54:55
Organization: Digital Equipment Corporation
Message-ID: <3th8pc$dl@nntpd.lkg.dec.com>
References: <3tfc1c$qvl@senator-bedfellow.MIT.EDU>

In article <3tfc1c$qvl@senator-bedfellow.MIT.EDU>, igorlord@mit.edu
(Igor Lyubashevskiy) writes...
>
>Hi, I am reading my VT420 manual, and it is totally clueless about the control
>sequences that envolve Hebrew modes.... Does anyone at DEC or otherwise know
>the correct values that go into those sequences ( CSI ? Pd h - like ).
>Also, what are the mode identifiers of DECHEM (Hebrew encoding mode) and
>DECNAKB (Greek Keyboard Mapping) since they are also mentioned to be either
> 34, 35, or 57 in the description, index, or examples.

There are actually four commands:

        DECRLM  - Cursor Right to Left Mode     ?34
        DECHEBM - Hebrew (Keyboard) Mode        ?35
        DECHEM  - Hebrew Encoding Mode          ?36
        DECNAKB - North American Keyboard Mode  ?57

I'm looking at my VT5xx programming manuals (available from Digital's ftp
site) and I still see a few typos, unfortunately.

>Finally, what is the function of
>DECNAKB and DECHEBM (two very similar functions) when SET?
The manual claims >that they function in an exactly opposite way to each other, which seems to me >highly illigical. They operate exactly as described. DECHEBM when reset and DECNAKB when set configure the terminal to use the North American keyboard layout. When DECHEBM is set and DECNAKB is reset, the corresponding "non North American" layout is configured. [Back when "specials" of the VT200 series terminals were done, commands to effect the similar operations (switching from a North American to a "non North American" keyboard for one) weren't always well rationalized with each other and these two got switched around. When those features were brought into the base VT400 unit, the definitions were kept the way they were for backwards compatibility with those units.] ------------------------------------------------------------------------------- Tim Lasko, Digital Equipment Corp., Marlborough MA (lasko@regent.enet.dec.com) My opinions are my own; the facts can speak for themselves. I'm on my own time. For Digital terminal support: call 1.800.777.4343 or email =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.mail.mime,comp.terminals,comp.software.international Path: cs.utk.edu!willis.cis.uab.edu!gatech!news.mathworks.com !newsfeed.internetmci.com!news.sprintlink.net!in2.uu.net!news.tele.fi !news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi!hurtta From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Date: 31 Aug 1995 06:28:25 GMT Organization: Finnish Meteorological Institute (FMI) Message-ID: <423kq9$7e5@kronos.fmi.fi> Subject: Re: Security and MIME (Especially, metamail) [ Added comp.terminals and comp.software.international as receiver. ] NED@innosoft.com (Ned Freed) writes in comp.mail.mime: in article <01HUOLFW583090MTNI@INNOSOFT.COM> | |<...> |Designers of user agents (and as you say this is not limited to MIME agents or |even mail user agents) are caught between a rock and hard place on this issue. 
|On the one hand, escape sequences are often used in text objects and if you |block them the text ends up looking like garbage. This is especially burden- |some to users of Japanese, Chinese, and Korean character sets that employ |escape sequence switching -- block the switching sequences and the result is |completely useless. And on the other hand, not blocking such sequences opens |the door to these kinds of attacks. And by the way, they aren't limited to |programmable keys -- programmable answerback sequences can also be used and |are a lot more common on older, poorly designed equipment. |<...> It is quite easy to allow only sequences what have _syntaxticallly_ correct according of ISO 2022. Switching sequences of Japanese, Chinese, and Korean character sets uses ISO 2022 codes (as far I know, I haven't read specs of everyone -- only some). And answerback codes and such a like don't match syntaxtically to these codes. Notice that matching of syntaxtically don't require to be list of all possible codes. =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.protocols.kermit.misc Path: cs.utk.edu!news.msfc.nasa.gov!newsfeed.internetmci.com!chi-news.cic.net !newsjunkie.ans.net!news.rmii.com!thoth.nilenet.com!ra.nilenet.com!gweisz From: gweisz@nilenet.com (Gideon Weisz) Subject: Hebrew e-mail, etc Date: 21 Dec 1995 04:28:02 GMT Organization: NileNet, Ltd Lines: 29 Message-ID: <4banoi$5k0@thoth.nilenet.com> For those who wish to do Hebrew e-mail, and already have a DOS PC and a UNIX internet node, things are now pretty easy, particularly if you have mskermit 3.14. We are even hoping that there will be a Hebrew mailing list soon. 
and with mskermit you can even compose hebrew messages in the recent English PINE easily, with the help of some scripts: kermit enables you to write in Hebrew characters and see them on your screen going the right way, while the scripts enable you to reverse their actual direction and right justify afterwards. some helpful files have been posted and are available on jerusalem1. e-brew.txt is a cookbook style info file e-brew.zip and its complementary ebrewadd.zip are a quickstart program package that can also serve as a convenient toolkit, and a later program and script package that improves it. the e-brew files are at ftp://ftp.jer1.co.il/pub/software/msdos/communication/e-brew.txt ftp://ftp.jer1.co.il/pub/software/msdos/communication/e-brew.zip ftp://ftp.jer1.co.il/pub/support/offline_mail/ebrewadd.zip the locations might change, but that's where the files are now. i don't want to use up bandwidth here, so anyone interested should contact me for a copy of the full announcement or anything else that i might be able to help with. gideon -- gideon weisz ïåòãâ [boulder, colorado] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.fonts,comp.std.internat Path: cs.utk.edu!gatech!news.mathworks.com!fu-berlin.de!news.belwue.de !news.uni-konstanz.de!Otto.Stolz Date: 9 Jan 96 11:39:09 GMT Organization: Universitaet Konstanz From: Otto Stolz To: kirshenbaum@hpl.hp.com References: <499ccn$102o@info4.rus.uni-stuttgart.de> <819044957snz@sahaja.demon.co.uk> X-URL: news:DKoI2K.1Ip@hplabsz.hpl.hp.com Message-ID: <30f253dd.0@news.uni-konstanz.de> Lines: 16 Subject: Re: Euro Currency Symbol (was: What does the "forin" char stand for?) Christopher Fynn (cfynn@sahaja.demon.co.uk) wrote: > Does anyone know if a new currency symbol for this monetary > unit has been decided upon? Stephen Baynes wrote: > All the teletext standards [...] give [...] as the European > Currency symbol [...] 
a glyph of a combined C and E evan@hpl.hp.com (Evan Kirshenbaum) wrote: > it is the one given in the > Unicode standard as character U+2040, "EURO-CURRENCY SIGN" In my copy of ISO/IEC 10646-1: 1993(E), this is character number 20A0; position 2040 is assigned to the CHARACTER TIE. Unicode, most probably, complies with ISO 10646-1, in this respect. Regards, Otto Stolz =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.protocols.kermit.misc Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!cs.utexas.edu !howland.reston.ans.net!gatech!newsfeed.internetmci.com!panix !news.columbia.edu!watsun.cc.columbia.edu!fdc From: fdc@watsun.cc.columbia.edu (Frank da Cruz) Date: 2 Apr 1996 15:22:20 GMT Organization: Columbia University Lines: 45 Message-ID: <4jrgnc$l49@apakabar.cc.columbia.edu> References: <4hf5q8$fkj@apakabar.cc.columbia.edu> <4hkbmo$gou@apakabar.cc.columbia.edu> Subject: Re: Kermit 3.14 + Kanji = troubles In article zippy@hairball.ecst.csuchico.edu (The Pinhead) writes: : In article <4hkbmo$gou@apakabar.cc.columbia.edu> : fdc@watsun.cc.columbia.edu (Frank da Cruz) writes: : :: set file character-set shift-jis <-- (Irrelevant) :: set terminal bytesize 8 <-- Yes, you need this :: set parity none <-- Ditto :: set terminal character-set transparent <-- Ditto :: :: This is exactly the set of commands you need. If it doesn't work, that :: most likely means that CP982 is not the active code page, or that you don't :: have DOS/V in Japanese mode. Another possibility is that the host that :: you are connecting to is not itself in 8-bit mode. For example, if it were :: a SunOS 4.x system, you would need to tell it to: :: :: stty pass8 :: :: before you could see 8-bit characters (the Kanji bytes of Shift-JIS have :: their 8th bits set to 1). Use the equivalent command ("stty cs8" or :: whatever) on other versions of UNIX or other operating systems. : : You've been really helpful, Frank! Thanks... 
However, just one : problem remains... MSKermit 3.14 seems to be remapping the line : drawing characters. The application is an ACUCobol program in : shift-jis mode. Using CKermit 5A(190) under Linux with Japanese : extentions, the line drawing character 0xc4 displays properly as a : horizontal bar, but under MSKermit 3.14 it displays as the katakana : character "to" (0x44). : Sorry for the delay in replying. This is from our informant in Japan: There are two problems: 1) There is no line drawing character in Japanese DOS/V character set (not only line drawing charters but also many symbols which are included US PC-DOS, e.g., copyright mark etc). 2) 0xc4 is officially defined as Katakana "to" in Shift-JIS code. If we change the mapping, it will cause many problems on many Japanese hosts where they use JIS-X-201 Katakana (Hankaku Katakana). (End quote) - Frank =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.os.linux.development.system,comp.std.internat,comp.terminals Path: cs.utk.edu!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!math.ohio-state.edu !howland.reston.ans.net!gatech!newsfeed.internetmci.com!in2.uu.net !nntp.inet.fi!news.funet.fi!news.csc.fi!kronos.fmi.fi!dionysos.fmi.fi !hurtta From: hurtta@dionysos.fmi.fi (Kari E. Hurtta) Date: 14 Apr 1996 10:29:40 GMT Organization: Finnish Meteorological Institute (FMI) Message-ID: <4kqk2k$5tm@kronos.fmi.fi> References: <4k73al$2mo@portal.gmu.edu> <4k87g6$5n@cortex.dialin.rrze.uni-erlangen.de> <4kandi$rm@cortex.dialin.rrze.uni-erlangen.de> <4kdcda$1fe@cortex.dialin.rrze.uni-erlangen.de> <316eb9a7.13321510@news.ucs.ubc.ca> In-Reply-To: Article <316eb9a7.13321510@news.ucs.ubc.ca> of Eric Gisin Subject: Re: Linux and UNICODE? [ Added comp.terminals as receiver. 
] ericg@unixg.ubc.ca (Eric Gisin) writes in comp.os.linux.development and comp.std.internat: <...> | I thought stateful encodings were added to standard C at IBM's request, whose | EBCDIC-based multibyte character sets require it. What is ISO 2022, and is | anyone using it? I wouldn't want to see GNU libc implement something that's | never going to be used. <...> It is something what is used for example in Digital VT series terminal for selecting and managing character sets... Look for example VT300 series or newer. Linux's console driver implementation does not count (if it is not better after that when I last time looked it :-)) ISO 2022 is used also in Chinize and Japanise character sets. So different parts of ISO 2022 are definately in wide use... =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Article 8995 of comp.dcom.telecom: Path: cs.utk.edu!darwin.sura.net!spool.mu.edu!telecom-request Date: 26 Sep 93 16:03:52 GMT From: jjmhome!pig!die@transfer.stratus.com (Dave Emery) Newsgroups: comp.dcom.telecom Subject: Re: Information Wanted on Six-bit Code Reply-To: jjmhome!pig!die@transfer.stratus.com Message-ID: Organization: Opinion Mongers Incorperated... Sender: telecom@eecs.nwu.edu Approved: telecom@eecs.nwu.edu X-Submissions-To: telecom@eecs.nwu.edu X-Administrivia-To: telecom-request@eecs.nwu.edu X-Telecom-Digest: Volume 13, Issue 665, Message 2 of 12 Lines: 69 In article johan@tts.lth.se (Johan M Karlsson) writes: > I just wonder if anybody know anything about the Six-bit code called > TTS, that was used by many newspapers in the 70's to receive stories > from the wire services. Like what does the letters TTS stand for? TTS standards for TeleTypeSetter. Indeed it is a 6-bit code which was developed by AT&T's now defunct Teletype subsidiary in the early 50s as a means of inputing news stories direct to Linotype machines. 
As such it incorporates the special control characters that operate Linotype machines such as upper rail and lower rail shifts and em space and en space. Originally in the days long before computers in the pockets of every reporter, the wire services had computerized systems that ran on mainframes for creating formated stock tables, sports box scores, racing information and other highly structured text. Sending this material in TTS code ready for direct input into a type casting machine saved local newspapers the services of several compositors and made it possible for them to publish reams of this sort of material at low cost. Later, in the 60's and early 70s the wire services developed computer programs to format (perform hyphenation and justification) their regular news feed into standard newspaper columns using Linotype control characters. Many of the newspaper oriented wire service wires (particularly the AP A wire) were transmitted in TTS code in this era and could be directly input to a Linotype typesetting machine. TTS code was popular for wire service distribution for another reason, it supported upper and lower case. The earlier Baudot alphabet only supported upper case which meant that a human being had to worry about getting the case correct in transcribing stories into type -- but TTS had the correct case already. TTS format paper tape in fact became a standard in the printing industry for input to composition equipment of later generations than Linotype machines. TTS represented an alphabet for encoding text formated for printing, and may still see some use for this purpose today. Teletype developed a modification of their model 15 workhorse wire service teleprinter to print TTS in upper and lower case on rolls of Teletype paper; this machine was called the model 20 monitor printer. 
Many newspapers which did not actually use TTS input to their typesetting machines for news stories used these machines to print out stories in upper and lower case for later entry by human compositors. Newspapers which used TTS input directly usually punched the TTS into 6 level paper tape for off line entry into Linotype machines. So a typical newspaper would have a monitor printer and a tape punch on each of their tts wires. TTS wire transmissions were usually low speed (66 or 75 wpm) at baud rates adjusted for the 8.42 element code. This resulted in some strange low baud rates that gave the designers of serial port boards for early minicomputers fits. TTS was largely replaced in the mid 70s by the high speed ASCII wire transmissions and by newspaper computerized composition systems which could do hyphenation and justification automatically and output text direct to optical typesetters. Remnents of it survive, however, in the standard ASCII format for transmitting wire service news stories which incorperates ASCII versions of some of the special typesetter control characters. =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.dcom.telecom Path: cs.utk.edu!gatech!howland.reston.ans.net!spool.mu.edu!telecom-request Date: Sun, 26 Sep 1993 14:10:51 -0500 (cdt) Message-ID: Organization: TELECOM Digest Sender: telecom@eecs.nwu.edu Approved: telecom@eecs.nwu.edu X-Submissions-To: telecom@eecs.nwu.edu X-Administrivia-To: telecom-request@eecs.nwu.edu X-Telecom-Digest: Volume 13, Issue 664, Message 15 of 15 From: Brian D McMahon Subject: Re: kUPL@ TELEGRAFNYJ MODEM (095) 212-3937 > [Moderator's Note: This message came to me from Russia. I have no idea > at all what he is saying, except I think it has to do with a BBS or > public access site in Moscow. This was the entire text. Can someone > read it to me? PAT] > sRO^NO KUPL@ TELEGRAFNYJ MODEM > tEL: (095) 212-39-37 sIDORENKO sERGEJ. Hi, Pat. 
That would be "srochno kuplyu telegrafnyj modem," or "urgently (want to) buy a telegraphic modem." Signed by Sergej Sidorenko. I have no idea what a "telegraphic" modem is; I'm not up on the technical terminology. At a guess, the gentleman wants to buy a FAX modem. The message text, BTW, is in a format known as KOI-7, one of several mutually incompatible (sigh) methods of transmitting Russian Cyrillic text over the net. Upper and lower case are reversed, as you probably guessed. Brian McMahon Postmaster / Acad. Software Support Grinnell College Computer Services Grinnell, Iowa 50112 USA Voice: +1 515 269 4901 Fax: +1 515 269 4936 [Telecom Moderator's Note: You think then a 'telegraphic modem' would be a fax modem? My thanks to the 27 other responses I received to this query. I selected a few to use here which make a good representative sample of the lot. PAT] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Article 8992 of comp.dcom.telecom: Path: cs.utk.edu!gatech!howland.reston.ans.net!spool.mu.edu!telecom-request Date: Sun, 26 Sep 1993 02:14:29 -0400 From: anarres!gaarder@TC.Cornell.EDU Newsgroups: comp.dcom.telecom Subject: kUPL@ TELEGRAFNYJ MODEM (095) 212-3937 Message-ID: Organization: TELECOM Digest X-Telecom-Digest: Volume 13, Issue 664, Message 14 of 15 Passing that through a little transliteration program I wrote back during the coup in the Soviet Union (remember then? I was glued to my Usenet feed!) produces: Srochno kuplyu telegrafnyy modem Tel: (095) 212-39-37 Sidorenko Sergey. Which I read as offering to buy a modem. I'm not sure just what "srochno" means in this context; my dictionary defines it as "of term; to be paid at a fixed date; due; payable". "Kuplyu" means "I buy"; I don't know whether a "telegrafnyy modem" is a special kind of modem or just a modem in general. Why this is here is a puzzle; probably it was sent to the wrong address. 
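
The case reversal noted in these replies follows from the code layout, and
the transliteration can be reproduced with a short Python sketch. It assumes
that the message is KOI-8 text with the eighth bit stripped, which is what
the letter pattern in the quoted message suggests: putting the bit back on
the bytes 0x40..0x7F and decoding with Python's standard koi8_r codec gives
the Cyrillic. The function name koi7_to_unicode is made up for this
illustration, and any genuine Latin letters in a message would be converted
as well.

```python
def koi7_to_unicode(text):
    """Decode 7-bit KOI text by restoring the high bit on bytes 0x40..0x7F."""
    raw = text.encode("ascii")
    # Letters live in 0x40..0x7F; digits, spaces and punctuation below 0x40
    # are left untouched.
    restored = bytes(b | 0x80 if 0x40 <= b <= 0x7F else b for b in raw)
    return restored.decode("koi8_r")


if __name__ == "__main__":
    print(koi7_to_unicode("sRO^NO KUPL@ TELEGRAFNYJ MODEM"))
```

With the subject line of the original message as input, this prints the
Russian text that both replies transliterate as "srochno kuplyu".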
Steve Gaarder gaarder@anarres.ithaca.ny.us [Moderator's Note: Well no, it was not sent to the wrong address. He wrote 'telecom-request@mintaka.lcs.mit.edu' which is just an alias that points at me. That is, he did not post to a newsgroup where it found its way to comp.dcom.telecom; some news program found it lacking authorization and shoved it to me. He mailed it direct, albiet to an alias I had forgotten existed, going back to the days of jsol. So he must think we can do something for him. Fancy that; he wants to buy a modem, and here I thought he was looking for publicity for his BBS or similar and decided to give it to him. PAT] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Date: Wed, 12 Mar 1997 14:09:38 GMT From: John Savard Newsgroups: alt.folklore.computers Subject: Re: Pre-1988 Windows ECS dski@cameonet.cameo.com.tw wrote: >Anyone know what Windows used for an extended character set before >ISO 8859-1 came along? >Dan Strychalski >dski@cameonet.cameo.com.tw No, but ISO 8859-1 came along in 1985, since the Amiga used it then. Only the plus and minus signs weren't agreed on (and, from the code chart, there in a position that ought to be used for the OE and oe ligatures, and, according to the manual for my inkjet printer, is so used in some Unix character set). The other possibility would have been to use the DOS character set, of course, which is still used in some Windows fonts. 
John Savard =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!news1.radix.net!news4.agis.net !www.nntp.primenet.com!nntp.primenet.com!news-feed.inet.tele.dk !news.nacamar.de!howland.erols.net!worldnet.att.net !cpk-news-hub1.bbnplanet.com!news.bbnplanet.com!newsxfer3.itd.umich.edu !newsxfer.itd.umich.edu!yale!oitnews.harvard.edu!fas-news.harvard.edu !newspump.wustl.edu!newsreader.wustl.edu!not-for-mail Date: Wed, 26 Mar 1997 17:53:44 -0600 Message-ID: <3339B708.7B98B902@artsci.wustl.edu> References: <5ha085$mh0@reader.seed.net.tw> From: Tom Stepleton Subject: Re: Strange IBM glyphs (Was: Amiga) dski@cameonet.cameo.com.tw wrote: > I've heard it said the 5051 was conceived as a game machine. To look at > some of the characters IBM assigned to values in the ASCII control range > -- playing-card symbols and the like -- the idea doesn't seem so far- > fetched. And a bunch of I/O addresses were designated as the "game port." I wonder about this as well. Why did IBM use all of those bizarre glyphs for the control characters? The smiley faces (01,02), the game cards (03-07), the gender signs (0B,0C), and the 16th notes (0E) don't seem to serve too much of a purpose and aren't that easy for Joe BASIC Game Programmer to put on the screen with only PRINT statements. I remember hearing somewhere long ago that all these card symbols and such originated on Wang word processing systems, but I don't trust my memory... 
--Tom +-----------+---------------------------+ ____ | Stepleton | ssteplet@artsci.wustl.edu |>-------|\__/_/__ +-----------+---------------------------+ \________} =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!chi-news.cic.net!metro.atlanta.com !feeder.chicago.cic.net!newsrelay.netins.net!news.ececs.uc.edu !newsfeeds.sol.net!news.maxwell.syr.edu!supernews.com!news Organization: All USENET -- http://www.Supernews.com Message-ID: <333c5a9c.12736765@news.comland.com> References: <5ha085$mh0@reader.seed.net.tw> <3339B708.7B98B902@artsci.wustl.edu> Date: Sat, 29 Mar 1997 00:00:30 GMT From: orestes@comland.com (William D. Leara) Subject: Re: Strange IBM glyphs (Was: Amiga) On Wed, 26 Mar 1997 17:53:44 -0600, Tom Stepleton wrote: > > I wonder about this as well. Why did IBM use all of those bizarre > glyphs for the control characters? The smiley faces (01,02), the game > cards (03-07), the gender signs (0B,0C), and the 16th notes (0E) don't > seem to serve too much of a purpose and aren't that easy for Joe BASIC > Game Programmer to put on the screen with only PRINT statements. > > I remember hearing somewhere long ago that all these card symbols and > such originated on Wang word processing systems, but I don't trust my > memory... You're right on the money. Check out the October 2, 1995 edition of FORTUNE magazine, specifically the interview with Paul Allen and Bill Gates. Bill says: "... we were also facinated by dedicated word processors from Wang, because we believed that general-purpose machines could do that just as well. That's why, when it came time to design the keyboard for the IBM PC, we put the funny Wang character set into the machine--you know, smiley faces and boxes and triangles and stuff. We were thinking we'd like to do a clone of Wang word-processing software someday." 
-- William Leara orestes@comland.com =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!vixen.cso.uiuc.edu!howland.erols.net!europa.clark.net !newsfeeds.sol.net!noc.nyx.net!nyx10.cs.du.edu!not-for-mail Date: 29 Mar 1997 12:27:22 -0700 Organization: University of Denver, Dept. of Math & Comp. Sci. Message-ID: <5hjqeq$raa@nyx10.cs.du.edu> NNTP-Posting-Host: nyx10.nyx.net From: snorwood@nyx10.cs.du.edu (Scott Norwood) Subject: origins of '\' (backslash) on keyboards? I remember reading with interest the thread on the origins of the '\' as a directory separator for M$-DOS (as opposed to the '/' used in UNIX). Now, here's another question: at what point did the backslash key become standard for computer keyboards? It's not on my typewriter, nor is it on the keyboard of my Apple II (whose keyboard is essentially the same as a teletype terminal), but it does exist on early DEC terminals (VT-100, etc.) and other equipment of the late-1970's vintage. How did this practice start? Does it have any roots in the UNIX convention of using the backslash to indicate that the following character should be treated literally (as in referring to filenames with spaces or other 'weird' characters in them).? 
-- Scott Norwood: snorwood@nyx.net, snorwood@balloon.ml.org, senorw@mail.wm.edu Lame Home Page #1: http://balloon.ml.org/ <-- School year only Lame Home Page #2: http://www.nyx.net/~snorwood/ <-- Regular page =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!newsfeed.nacamar.de !europa.clark.net!newsfeed.internetmci.com!news1.infoave.net!usenet Date: Sun, 30 Mar 1997 05:27:14 GMT Organization: Fantasy Farm Fibers Message-ID: <333df90f.107920290@news.swva.net> References: <5hjqeq$raa@nyx10.cs.du.edu> NNTP-Posting-Host: pem02-02.swva.net From: bernie@rev.net (Bernie Cosell) Subject: Re: origins of '\' (backslash) on keyboards? snorwood@nyx10.cs.du.edu (Scott Norwood) wrote: } } I remember reading with interest the thread on the origins of the '\' } as a directory separator for M$-DOS (as opposed to the '/' used in UNIX). } } Now, here's another question: at what point did the backslash key } become standard for computer keyboards? It has been on computer keyboards for a very long time. The early Model 33 Teletypes had forward-slash and reverse-slash on the keyboard. At the time, there was no particular preference for one over the other: the keyboard just included both slashes. [it also had "uparrow" and "backarrow", which a later revision of ASCII (at the time the model 37 came out) changed to caret and underscore, respectively]. Neither forward-slash nor reverse-slash were added for the convenience of computers... /bernie\ -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Date: 30 Mar 1997 20:44:25 GMT From: "Douglas W. Jones,201H MLH,3193350740,3193382879" Newsgroups: alt.folklore.computers Subject: Re: origins of '\' (backslash) on keyboards? 
From article <5hjqeq$raa@nyx10.cs.du.edu>, by snorwood@nyx10.cs.du.edu (Scott Norwood): > Now, here's another question: at what point did the backslash key > become standard for computer keyboards? ... The Teletype Models 33 (ASR, KSR, etc) had a backslash (shift L, if memory serves), and anything that copied the Teletype character set verbatim also had it. The Model 33 was the first ASCII terminal, and the 64 character subset of ASCII it supported (upper case only!) included the backslash. Doug Jones =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!chi-news.cic.net!arclight.uoregon.edu !feed1.news.erols.com!howland.erols.net!swrinde!ihnp4.ucsd.edu !newshub.nosc.mil!news!mshapiro Organization: NCCOSC RDT&E Division, San Diego, CA References: <5hjqeq$raa@nyx10.cs.du.edu> <859668981snz@tnglwood.demon.co.uk> Message-ID: <1997Mar31.215753.24813@nosc.mil> Date: Mon, 31 Mar 1997 21:57:53 GMT From: Michael D Shapiro Subject: Re: origins of '\' (backslash) on keyboards? In article , Al Castanoli wrote: >Robert Billing writes: > >[...] > >: The key was certainly on the old ASR33, long before there were >: VT-anything terminals. I suspect that it antedates UN*X itself, and >: goes back to the deep magic at the dawn of ASCII. > >It was not on the Model 28 ASR, though ... I remember having to put >"backwards slash" in messages with Mod 28 ASR's and KSR's. Probably >a tradeoff in cramming ASCII into the Baudot bitstream. > The reverse solidus (back slash) showed up fairly early in ASCII code development, which was (as I recall) in the early 1960s. An excellent background on the history of character sets is in the book "Coded Character Sets" (I was about to give a more complete reference but forgot the author and publisher). Please let me know if you want a more complete reference. 
Incidentally, the Japanese equivalent of ASCII, JISCII, places the yen symbol in place of the reverse solidus. -- Michael D. Shapiro, Ph.D. Internet: mshapiro@nosc.mil Code 4123, NCCOSC RDT&E Division (NRaD) San Diego CA 92152 Voice: (619) 553-4080 FAX: (619) 553-4808 DSN: 553-4080 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Date: 1 Apr 1997 16:12:05 GMT From: BBReynolds Message-ID: <19970401161101.LAA26059@ladder01.news.aol.com> Subject: Re: origins of '\' (backslash) on keyboards? The complete reference is Charles E. Mackensie, , The Systems Programming Series, Reading, Massachusetts, Addison-Wesley, 1980. Chapters 12 and 13 cover the development of ASCII; the reverse solidus a/k/a (or is that a\k\a??) backslash was part of the original specification. -- Bruce B. Reynolds, Systems Consultant: Founder of Trailing Edge Technologies--- Sweeping Up Behind Data Processing Dinosaurs =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Date: 1 Apr 1997 22:02:06 GMT Message-ID: <5hs0ku$mae$1@news.wizvax.net> From: John Wilson Subject: Re: origins of '\' (backslash) on keyboards? In article <5hmjb9$afo@flood.weeg.uiowa.edu>, Douglas W. Jones,201H MLH,3193350740,3193382879 wrote: >The Teletype Models 33 (ASR, KSR, etc) had a backslash (shift L, if memory >serves), and anything that copied the Teletype character set verbatim also >had it. The Model 33 was the first ASCII terminal, and the 64 character >subset of ASCII it supported (upper case only!) included the backslash. What particularly impressed me about that nasty little mechanical keyboard on the 33 was that its method of generating control characters was consistent, i.e. shift-K gave you "[" and if you wanted ESCape (^[) (N.B. NOT ALTMODE!), you typed ctrl-shift-K. 
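The consistency described here (shift-K gives "[", and adding Ctrl gives ESC) follows from simple arithmetic on the character code. A minimal sketch in Python, assuming the common ASCII convention that Ctrl maps codes 0x40 through 0x5F onto 0x00 through 0x1F by clearing the high bits; the post does not say how the Model 33 did this mechanically, and the function name `ctrl` is made up for illustration:

```python
def ctrl(ch: str) -> str:
    """Return the control character for Ctrl plus a printing character.

    Assumption: Ctrl keeps only the low five bits of the code, which maps
    '@' (0x40) through '_' (0x5F) onto NUL (0x00) through US (0x1F).
    """
    code = ord(ch)
    if not 0x40 <= code <= 0x5F:
        raise ValueError("this sketch only handles codes 0x40-0x5F")
    return chr(code & 0x1F)


# '[' is 0x5B, so Ctrl-[ is 0x1B, which is ESC.
assert ctrl("[") == "\x1b"
# 'C' is 0x43, so Ctrl-C is 0x03 (ETX).
assert ctrl("C") == "\x03"
```

Under that assumption, any key that produces "[" produces ESC once Ctrl is held as well, which matches the ctrl-shift-K behaviour described.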
And if I remember right, there was an interlock so that when you had ctrl and shift down, only those data keys that would now send something different would allow themselves to be pressed. Kinda cute. That answerback drum is something pretty special too... -- John Wilson 0,3 @ SID =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.er.usgs.gov!mcmcnews.er.usgs.gov !news.indiana.edu!chi-news.cic.net!arclight.uoregon.edu!europa.clark.net !cpk-news-hub1.bbnplanet.com!cam-news-hub1.bbnplanet.com !news.bbnplanet.com!howland.erols.net!EU.net!news.eunet.fi !news.microdata.fi!nntp.inet.fi!news.sci.fi!usenet Message-ID: <3340bb0f.65034735@news.sci.fi> Date: Tue, 01 Apr 1997 09:03:07 GMT From: Paul Keindnen Subject: Re: origins of '\' (backslash) on keyboards? mshapiro@nosc.mil (Michael D Shapiro) wrote: > >The reverse solidus (back slash) showed up fairly early in ASCII code >development, which was (as I recall) in the early 1960s. An excellent >background on the history of character sets is in the book "Coded >Character Sets" (I was about to give a more complete reference but >forgot the author and publisher). Please let me know if you want a >more complete reference. > >Incidentally, the Japanese equivalent of ASCII, JISCII, places the yen >symbol in place of the reverse solidus. While strictly speaking ASCII is a purely US standard, many national 7-bit character sets exist in the rest of the world, which are almost identical to the ASCII character set, but a few character positions are reserved for national variations. There are usually 6 to 9 character positions that differ from the ASCII representation. Crosshatch character, code 35 (decimal), is in many character sets the pound sign. Character codes after Z (91..94) and after z (123..126) are used for national variants. 
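The effect of a national variant at one code position can be sketched in Python. The assignments for code 92 below are the ones listed in the rest of this post, copied as given and not checked against the national standards; the names `CODE_92` and `decode_7bit` are hypothetical, and only position 92 is varied here, although real variants differ in several positions.

```python
# Meaning of 7-bit code 92 in several national sets, as listed in the post.
CODE_92 = {
    "US": "\\",
    "Dutch": "\u00bd",      # one half
    "Finnish": "\u00d6",    # upper case O with two dots
    "German": "\u00d6",
    "Swedish": "\u00d6",
    "French": "\u00e7",     # c with cedilla
    "Italian": "\u00e7",
    "Norwegian": "\u00d8",  # O with a slash
    "Spanish": "\u00d1",    # N with tilde
}


def decode_7bit(data: bytes, variant: str) -> str:
    """Decode 7-bit text, substituting the national character at code 92 only."""
    return "".join(CODE_92[variant] if b == 92 else chr(b) for b in data)


# The same bytes read differently depending on the variant assumed.
path = b"C:\\DOS"
for variant in ("US", "German", "Norwegian"):
    print(variant, decode_7bit(path, variant))
```

The bytes never change; only the interpretation does, which is why a DOS path sent to a terminal using a national set shows a letter where the backslash was typed.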
The backslash (92) is in the Dutch character set '1/2', while in the Finnish, German and Swedish character sets it is upper case O with two dots, in the French and Italian character set it is c with cedilla, in the Norwegian set it is O with a slash and in the Spanish set it is N with tilde.

Apparently when the terminal manufacturers had to make keyboards for these languages and include keys for the "extra" characters, it was not economical to manufacture keyboards with different number of keys for each market, the extra keys in the US version were used to generate the same character code as the foreign version, but was labelled with the backslash etc. key cap, which otherwise would not have "deserved" an own key.

Paul Keinanen
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Sender:
Message-Id: <9207241339.AA04105@skinfaxe.diku.dk>
Cc:
Date: Fri, 24 Jul 1992 15:39:12 +0200
From: Steen Linden
Subject: Re: Character sets

In message <9207231551.AAgandalf08152@gandalf.uio.no> asked:
>
> [about character sets in email-directory support]
>
> If not, where do I start? Must I patch each DUA, or just some library?

The ISO8859-1 conversion stuff is in libcommon.a. Take a look at isode-8.0/dsap/common/string.c. The most interesting functions are strprint() and iso8859print().

I haven't done any of the work you are requesting, though I could definitely use it. I was just looking through the code in search of the T.61 version of our Danish common national letters. Here they are by the way:

T.61    X11 Keysym    ISO8859/1    ASCII
----------------------------------------
\f1     ae            0xE6         {
\e1     AE            0xC6         [
\f9     oslash        0xF8         |
\e9     Ooblique      0xD8         \
\caa    aring         0xE5         }
\caA    Aring         0xC5         ]

--Steen
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: alt.folklore.computers
Date: Thu, 03 Apr 1997 22:23:33 GMT
Message-ID: <5i1au5$bpj@tor-nn1-hb0.netcom.ca>
From: John Savard
Subject: Re: origins of '\' (backslash) on keyboards?
In <3340bb0f.65034735@news.sci.fi>, keinanen@sci.fi (Paul Keindnen) wrote: > > While strictly speaking ASCII is a purely US standard, many national > 7-bit character sets exist in the rest of the world, which are almost > identical to the ASCII character set, but a few character positions > are reserved for national variations. Yes, and these character sets belong to International Telegraph Alphabet No. 5, which is the international version of ASCII; so there is a worldwide standard based on ASCII. John Savard =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Message-ID: <5i1aop$bpj@tor-nn1-hb0.netcom.ca> Date: Thu, 03 Apr 1997 22:20:40 GMT From: John Savard Newsgroups: alt.folklore.computers Subject: Re: origins of '\' (backslash) on keyboards? snorwood@nyx10.cs.du.edu (Scott Norwood) wrote: > I remember reading with interest the thread on the origins of the '\' > as a directory separator for M$-DOS (as opposed to the '/' used in UNIX). > Now, here's another question: at what point did the backslash key > become standard for computer keyboards? Well, back in 1964, when the original ASR-33 Teletype was produced-- when ASCII was invented, in other words--the backslash was part of the character set. Back then, the caret was instead an up arrow (which it should have remained, being useful as an exponentiation operator); the underscore was an arrow pointing left; and there were no lowercase characters; ` { | } and ~ did not exist yet. However, in addition to DEL, the last few characters before it were controls as well: ACK, ESC, and ALT MODE then are now printing characters, and a different control character is used for ESC. The last 8 of the first 32 characters did not have their present meanings; they were just S0 through S7. 
John Savard =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: alt.folklore.computers Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!newsfeed.nacamar.de !news-feed.inet.tele.dk!cpk-news-hub1.bbnplanet.com!news.bbnplanet.com !news-peer.sprintlink.net!news.sprintlink.net!sprint!howland.erols.net !spring.edu.tw!feeder.seed.net.tw!reader.seed.net.tw!!dski Organization: Cameo Communications, Inc. Message-ID: <5iujss$h0h@reader.seed.net.tw> NNTP-Posting-Host: 192.72.104.4 Date: 15 Apr 1997 00:59:08 GMT From: dski@cameonet.cameo.com.tw Subject: Re: ASCII History - No Cents? John Savard (seward@netcom.ca) wrote -- > Since ASCII was originally developed in 1963 for US use, the cents > sign would perhaps have been a useful thing to include. Both ISO and ASA (now called ANSI) began work on text encoding standards in 1961. ASA being an ISO member body, it seems likely that the work was coordinated to some degree. The 1963 ASA standard was for a six-bit uppercase-only version of ASCII; seven-bit "ASCII" seems to have been adopted first in *Europe*, in the form of ECMA-6, which the European Computer Manufacturers Association approved in 1965. The earliest seven-bit U.S. version I know of dates from 1968. This is also the year in which President Johnson mandated ASCII for the federal government's computer operations. > I have felt that the following assignments would make a good 'National > Use' version of ASCII for North American English-language word > processing, considering the keyboard arrangement: Which keyboard arrangement? Before IBM got into the microcomputer sca^H^H^Hbiz, many non-alphanumeric marks were not where you find them now. Most digital keyboards used ASCII-based pairings (["] with [2]; [&] with [6]; ['] with [7]; [(] and [)] with [8] and [9]; etc.). IBM used a layout similar to that of the Selectric. 
> Of course, I am deeply distressed both by the placement of the > multiplication and division signs in eight-bit ASCII where OE and oe > clearly belong... > > and, in the opposite direction, by the fact that the only standard > eight-bit ASCII is totally devoted to foreign-language word > processing. I wanted AND, OR, less than or equal to, not equal to, > greater than or equal to, and other symbols useful for _programming > languages_ ( X and -^H:, although useful for ALGOL, hardly count ) in > an eight-bit ASCII. Along, of course, with the _Greek_ alphabet [...] Code Page 437! You have it! I've seen so many different characters used for AND, OR, NOT, et al, in *typeset* material, I wonder if they just couldn't agree on which ones to use. Did these get standardized when I wasn't looking? > (although LLL 8-bit ASCII isn't perfect either, as APL should instead > get a character set of its own). "8-bit ASCII" is a contradiction in terms, I believe. What's LLL? -- Dan Strychalski =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Path: utkcs2!stc06.ctd.ornl.gov!news.he.net!news.maxwell.syr.edu !news-peer.sprintlink.net!news.sprintlink.net!Sprint!howland.erols.net !europa.clark.net!newsfeed2!argos.tel.hr!news Date: 7 May 1997 19:36:08 GMT Organization: Croatian Post & Telecommunications Message-ID: <01bc5b1d$4d7c4420$86e11dc3@gost.hr> References: <01bc5620$cf1a97e0$62e31dc3@gost.hr> <01bc56f2$19bb2220$d2e31dc3@gost.hr> <5kepgr$rgf@neptune.theplanet.co.uk> NNTP-Posting-Host: ac23-p1-zg.tel.hr From: "Bernard Grgic" Subject: Re: Need help with Fonts in Hyper Terminal Win95 Only monospaced (unproportional) fonts are listed in Hyper Terminal font set, NOT every TT Font, as I expected in the beginnig. 
I had to redesign one of them (make Croatian characters instead of some other characters like square brackets). The problem was that I need to communicate with UNIX through code page CP437, not CP852, where I have Croatian characters by default. I have a VGA driver for CP437 with (redesigned) Croatian characters, but it does not work with programs in graphic mode.

Greetings,
Bernard.

oliver st.john wrote in article <5kepgr$rgf@neptune.theplanet.co.uk>...
>
> >Do not waste your time reading the text below!
> >The problem has been solved, successfully.
> >Bernard
>
> [
> [ HOW? I'm sure a few of us would like to know...
> [
>
> >Bernard Grgic wrote in article
> > <01bc5620$cf1a97e0$62e31dc3@gost.hr>...
> >> Hi,
> >> How can I choose another TT Font which is not specified in Hyper Terminal? I
> >> need to do it if I want to have Croatian characters on the screen when
> >> communicating with UNIX. I have such characters in TT Fonts but they are not
> >> listed in the Hyper Terminal font list.
> >>
> >> If anyone can help or give me any suggestion, please, do it.
> >> Thank you.
> >> Bernard
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.lang.pl1
Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!newsfeeds.sol.net
 !news.maxwell.syr.edu!feed1.news.erols.com!news.ecn.uoknor.edu
 !munnari.OZ.AU!news.mel.connect.com.au!harbinger.cc.monash.edu.au
 !news.rmit.EDU.AU!goanna.cs.rmit.edu.au!not-for-mail
Message-ID: <5outhp$h6n$1@goanna.cs.rmit.edu.au>
References: <33B2DACD.4279@ix.netcom.com>
Organization: Comp Sci, RMIT University, Melbourne, Australia.
Date: 27 Jun 1997 09:21:29 +1000
From: rav@goanna.cs.rmit.edu.au (robin)
Subject: Re: CHARSET (48) vs CHARSET (60)

In <33B2DACD.4279@ix.netcom.com>, dneubart@ix.netcom.com writes:
>I'm trying to map the PL/I 48-character set to 60-character set.
>I haven't been able to match anything more than
>
>     .. (dot dot) to : (colon)
>     ,.
(comma dot) to ; (semi-colon)
>
>Does anyone know the 48-character equivalents for
>
>     > (greater than)
>     < (less than)
>     | (logical or)
>     etc.

OTOMH,

    >   is GT
    <   is LT
    |   is OR
    &   is AND
    ||  is CAT
    >=  is GE
    <=  is LE
    ^=  is NE
    ^   is NOT
    ->  is PT
    ^>  is NG
    ^<  is NL

Are there any others?
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Newsgroups: comp.lang.pl1
Path: utkcs2!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!vixen.cso.uiuc.edu
 !ais.net!news.maxwell.syr.edu!howland.erols.net!psinntp
 !clothos.candle.com!phobos.candle.com!news
Organization: Candle Corporation
Message-ID: <33B2FBEE.E9E@candle.com>
References: <33B2DACD.4279@ix.netcom.com>
Date: Thu, 26 Jun 1997 16:31:58 -0700
From: Eric Jackson
Subject: Re: CHARSET (48) vs CHARSET (60)

dneubart@ix.netcom.com wrote:
>
> I'm trying to map the PL/I 48-character set to 60-character set.

Boy, this takes me back. These are the ones I can think of off hand:

    GT   >
    LT   <
    OR   |
    LE   <=
    GE   >=
    NOT  *
    NE   *=
    CAT  ||
    AND  &
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Path: cs.utk.edu!darwin.sura.net!spool.mu.edu!bloom-beacon.mit.edu!ai-lab
 !prep.ai.mit.edu!gnu
Newsgroups: gnu.announce,gnu.utils.bug,comp.unix.shell,
 comp.unix.programmer,comp.unix.misc
Followup-To: gnu.utils.bug
Message-ID: <9312240149.AA11561@icule.UUCP>
To: info-gnu@prep.ai.mit.edu
Date: Fri, 24 Dec 93 01:49:20 GMT
From: pinard%icule.UUCP@iro.umontreal.ca (Francois Pinard)
Subject: Release: GNU recode version 3.3
Approved: info-gnu@prep.ai.mit.edu

Here is my Christmas gift to GNU users. Best wishes to all of you!

GNU recode 3.3 should soon be available on prep.ai.mit.edu, as file pub/gnu/recode-3.3.tar.gz. All reported bugs have been corrected. Thanks to all those who contributed comments or suggestions.

recode converts files between character sets and usages. When exact transliterations are not possible, it may get rid of the offending characters or fall back on approximations.
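The two fallbacks mentioned here, dropping a character or approximating it, can be sketched in Python. This is not how recode itself is implemented; it is only an illustration using the standard `unicodedata` module, and the function name `to_ascii` and its `approximate` parameter are hypothetical.

```python
import unicodedata


def to_ascii(text: str, approximate: bool = True) -> str:
    """Reduce text to 7-bit ASCII.

    With approximate=True an accented letter is replaced by its base letter
    when Unicode defines a decomposition; otherwise the character is dropped.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)          # already ASCII, keep unchanged
        elif approximate:
            # NFKD splits e.g. c-cedilla into 'c' plus a combining cedilla;
            # keeping only the ASCII part leaves the base letter.
            decomposed = unicodedata.normalize("NFKD", ch)
            out.append("".join(c for c in decomposed if ord(c) < 128))
        # else: drop the offending character entirely
    return "".join(out)


assert to_ascii("Fran\u00e7ois") == "Francois"
assert to_ascii("Fran\u00e7ois", approximate=False) == "Franois"
```

Either fallback loses information, so a file converted this way cannot be converted back to the original.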
This program recognizes or produces nearly 150 different charsets, able to transliterate files between almost any pair. Most RFC 1345 charsets are supported.

Please report bugs to: bug-gnu-utils@prep.ai.mit.edu

Here is a list of user visible changes from version 3.2.4:

* Charsets atarist, ebcdic-ccc, ebcdic-ibm and nextstep have been added.
* Also, most RFC 1345 charsets and aliases are handled. That's a bunch!
* Old ascii disappears because of RFC 1345's ascii, use ascii-bs instead.
* Old maci disappears because of RFC 1345's macintosh, use applemac instead.
* Charsets cccascii and cdcascii disappear, use ebcdic-ccc and ebcdic instead.
* Recoding between latin1, ibmpc and applemac is (almost) reversible.
* The texinfo documentation has been reorganized, this to be continued.
* Long options are accepted, charset names may be abbreviated.
* Option --list (-l) displays charsets, aliases and contents in many formats.
* Option --strict (-s) asks for stricter, non-reversible recodings.
* Option --graphics (-g) approximates ibmpc rulers with ASCII graphics.
* Option --header (-h) produces C source for many recoding tables.
* Option --auto-check (-a) reports about all possible recodings.
* Option --ignore (-x) prevents a charset from being selected.
* Execution has been sped up through step merging, hashing for charset names.
* Many various buglets have been eradicated, portability increased.
* Charsets may be edited out by modifying the Makefile only.
* Configuration is made through the use of an external config.h file.

--
Franc,ois Pinard ``Vivement GNU!'' pinard@iro.umontreal.ca
About the League for Programming Freedom? Email me or lpf@uunet.uu.net

[ Most GNU software is packed using the new `gzip' compression program. Source code is available on most sites distributing GNU software.
For information on how to order GNU software on tape, floppy, or cd-rom, check the file etc/ORDERS in the GNU Emacs distribution or in GNUinfo/ORDERS on prep, or e-mail a request to: gnu@prep.ai.mit.edu By ordering your GNU software from the FSF, you help us continue to develop more free software. Media revenues are our primary source of support. Donations to FSF are deductible on US tax returns. The above software will soon be at these ftp sites as well. Please try them before prep.ai.mit.edu! thanx -gnu@prep.ai.mit.edu ASIA: ftp.cs.titech.ac.jp, utsun.s.u-tokyo.ac.jp:/ftpsync/prep, cair.kaist.ac.kr:/pub/gnu AUSTRALIA: archie.au:/gnu (archie.oz or archie.oz.au for ACSnet) AFRICA: ftp.sun.ac.za:/pub/gnu MIDDLE-EAST: ftp.technion.ac.il:/pub/unsupported/gnu EUROPE: irisa.irisa.fr:/pub/gnu, ftp.univ-lyon1.fr:pub/gnu, ftp.mcc.ac.uk, unix.hensa.ac.uk:/pub/uunet/systems/gnu, src.doc.ic.ac.uk:/gnu, ftp.win.tue.nl, ugle.unit.no, ftp.denet.dk, ftp.informatik.rwth-aachen.de:/pub/gnu, ftp.informatik.tu-muenchen.de, ftp.eunet.ch, nic.switch.ch:/mirror/gnu, ftp.funet.fi:/pub/gnu, isy.liu.se, ftp.stacken.kth.se, ftp.luth.se:/pub/unix/gnu, archive.eu.net WESTERN CANADA: ftp.cs.ubc.ca:/mirror2/gnu USA: wuarchive.wustl.edu:/mirrors/gnu, labrea.stanford.edu, ftp.digex.net:/pub/gnu, ftp.kpc.com:/pub/mirror/gnu, ftp.cs.widener.edu, uxc.cso.uiuc.edu, ftp.hawaii.edu:/mirrors/gnu, ftp.cs.columbia.edu:/archives/gnu/prep, col.hp.com:/mirrors/gnu, gatekeeper.dec.com:/pub/GNU, ftp.uu.net:/systems/gnu ] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Date: 19 Mar 1998 08:53:34 GMT From: "Casper H.S. Dik - Network Security Engineer" Newsgroups: comp.unix.solaris Subject: Re: Special Symbol Characters (copyright, trademark, etc.)? 
[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]

R!ch writes:
>On Wed, 18 Mar 1998, Akira Hangai wrote:
>> How could I type a special symbol character such as copyright,
>> trademark, registered, etc., in a program like Text Editor, Netscape
>> Mail, or even ShellTool/Dt Terminal?
>There's a section in one of the manuals that shows the Compose key
>sequences for most of these (stuff like £, ©, æ, etc) - but irritatingly,
>I've forgotten which one. IIRC, you have to be using an 8 bit locale
>to display the characters.

There's also the table in /usr/openwin/share/include/X11/Suncompose.h

ComposeTableEntry compose_table[] = { ...

Of course, as a native Dutch person I miss the ability to use compose i-j to create a "ÿ" (y with diaeresis, compose " y)

Note that you can also use the reverse of the compositions as the lookup routine will first sort the two characters on ASCII values and then look up the entry in the table.

Casper
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Date: 21 Mar 1998 00:41:37 GMT
From: "Richard L. Hamilton"
Newsgroups: comp.unix.solaris
Subject: Re: Special Symbol Characters (copyright, trademark, etc.)?

In article , R!ch writes:
> On Wed, 18 Mar 1998, Akira Hangai wrote:
>
>> How could I type a special symbol character such as copyright,
>> trademark, registered, etc., in a program like Text Editor, Netscape
>> Mail, or even ShellTool/Dt Terminal?
>
>
> There's a section in one of the manuals that shows the Compose key
> sequences for most of these (stuff like £, ©, æ, etc) - but irritatingly,
> I've forgotten which one. IIRC, you have to be using an 8 bit locale
> to display the characters.
>

Or just look at the compose_table[] initializer in /usr/openwin/include/X11/Suncompose.h

You don't even have to understand C to figure that one out.
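The order-independent lookup Casper describes (sort the two characters, then look the pair up) can be sketched as follows. This is an illustration in Python, not the Sun implementation: the three entries are assumed examples and are not copied from Suncompose.h, and the names `COMPOSE` and `compose` are hypothetical.

```python
# Illustrative pairs only; the real table is the compose_table[] initializer.
_ENTRIES = [
    ("c", "o", "\u00a9"),  # copyright sign
    ("a", "e", "\u00e6"),  # ae ligature
    ("L", "-", "\u00a3"),  # pound sign
]

# Store every pair under a key sorted by character code, so that one table
# entry serves both typing orders.
COMPOSE = {tuple(sorted((a, b))): result for a, b, result in _ENTRIES}


def compose(first: str, second: str):
    """Return the composed character, or None if the pair is not in the table."""
    return COMPOSE.get(tuple(sorted((first, second))))


# Either order finds the same entry.
assert compose("c", "o") == compose("o", "c") == "\u00a9"
assert compose("-", "L") == "\u00a3"
assert compose("x", "y") is None
```

Sorting before the lookup halves the table size, at the cost of not being able to give the two orders different meanings.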
> -- > R!ch (Email is flakey at present: use richardt@keaton.uk.sun.com) > | Richard Teer richard.teer@uk.sun.com | > | WWW: www.rkdltd.demon.co.uk | ftp> get |fortune 377 I/O error: smart remark generator failed Bogonics: the primary language inside the Beltway mailto:rlhamil@mindwarp.smart.net http://www.smart.net/~rlhamil =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= ------------------------------------------------------------------- Date: 27 Mar 1998 09:12:37 GMT From: vjp2@dorsai.org @smtp.dorsai.org (Vasos Panagiotopoulos +1-917-287-8087 Bioengineer-Financier) Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Subject: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts I have successfully read Greek from Lynx on Windows using the Courier New Greek font from www.hri.org/fonts about two years ago - (yet it did not work with the www.goarch.org Greek pages - at least not back then - and they told me that since it worked with Windows, they weren't going to bother). But I have never succeeded in doing this with DOS and MSKermit. I have only tried this using the abcgrl program (I thought it should work because kdp works the same way for Japanese). Perhaps I have been setting the character set wrong in Lynx (heck if I even remember what I used in Windows). HRI had a codepage which doesn't look like the typical IBM codepages in name (I forget right now) so I was confused if it would work and if indeed IBM offers a different codepage inside Greece. Where would I be able to find the standard IBM codepage in the USA (preferably on the web)? Is it possible that the problem with the IBM codepage is that it uses an older format and not the (ISO) ELOT 928? (I saw the Greek fonts for Win 3.11 and I don't recall seeing it there, but I might be wrong.) Please excuse my confusion. 
- = - Vasos-Peter John Panagiotopoulos II, Columbia'81+, Bioengineer-Financier, NYC BachMozart ReaganQuayle EvrytanoKastorian http://WWW.Dorsai.Org/~vjp2 vjp2@{MCIMail.Com|CompuServe.Com|Dorsai.Org} ---{Nothing herein constitutes advice. Everything fully disclaimed.}--- ---------------------------------------------------------------------- Date: 27 Mar 1998 15:33:44 GMT From: Frank da Cruz Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts MS-DOS Kermit has no built-in support for Greek. You would need to find and load a Greek code page that agrees with the host encoding, and then use "set terminal character set transparent". : Where would I be able to find the : standard IBM codepage in the USA (preferably on the web)? : Good question. If you find an answer, please be sure to post it. By the way, Kermit 95 does support Greek: both ELOT 927 and 928. - Frank ------------------------------------------------------------- Date: 30 Mar 1998 10:13:53 GMT From: vjp2@dorsai.org @smtp.dorsai.org (Vasos Panagiotopoulos +1-917-287-8087 Bioengineer-Financier) Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts Followup-To: comp.protocols.kermit.misc,grk.forthnet.users,soc.culture.greek I got gauss.cpi from www.hri.org/fonts but don't know if there's a way to make MS-Kermit alone use it. Since it doesn't work the way the MS-DOS help files say it shoul (but I don't have the Greek files the MS-DOS help files say I need - I'm told only NT 4.0 DOS is country-blind and has them all on one version - I was wondering if there is a web page to find them at? Or do I have to go out and buy "Greek MS-DOS"?). KDP is a Japanese-font utility which allows Kermit to read Japanese (You run Kermit with the DOS command "KDP MSKERMIT") I have used the ABCGRL.COM utility (TSR?) 
to use ELOT fonts in Emacs, VEdit and other DOS programs, but it doesn't seem to work with Kermit (linking to Lynx). In Windows, connecting to Lynx with a terminal emulator, works ok if I set IBM CODEPAGE and RAW MODE. But in MSKermit, this just beeps alot and splatters all over the page when Greek text is encountered. I am also confused, so sorry in advance. - = - Vasos-Peter John Panagiotopoulos II, Columbia'81+, Bioengineer-Financier, NYC ------------------------------------------------------------ Date: 30 Mar 1998 15:36:09 GMT From: Frank da Cruz Newsgroups: comp.protocols.kermit.misc, grk.forthnet.users, soc.culture.greek Subject: Re: MSDOS Kermit and Unix Lynx and Greek ELOT 928 Fonts In article <6fh6l5$1du@news3.euro.net>, Denis Liigeois wrote: : Just a question: ELOT 928 is ISO-8859-7. What is ELOT 927 ? : It is a 7-bit set (like ASCII) in which the lowercase Roman letters are replaced by uppercase Greek letters. - Frank - -------------------------------------------------------------- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Message-ID: <567ce$a2a1b.32c@news.kea.bc.ca> X-Newsreader: Microsoft Outlook Express 4.72.2106.4 X-Mimeole: Produced By Microsoft MimeOLE V4.72.2106.4 Date: Wed, 6 May 1998 10:42:03 -0700 From: "Michael Simms" Subject: New Euro Currency symbol and DEC terminals Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will support the new Euro Currency symbol. Such as which position in the character sets. Any information would be appreciated. =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Date: 6 May 1998 23:34:42 GMT From: "T.E.Dickey" Newsgroups: comp.terminals Subject: Re: New Euro Currency symbol and DEC terminals Michael Simms wrote: : Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will : support the new Euro Currency symbol. Such as which position in the : character sets. 
Any information would be appreciated. without a hardware change, they won't (it's not an ISO-8859-1 character). -- Thomas E. Dickey dickey@clark.net http://www.clark.net/pub/dickey =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Date: 7 May 1998 01:55:26 GMT From: Jeffrey Altman Newsgroups: comp.terminals Subject: Re: New Euro Currency symbol and DEC terminals In article <6iqs2i$agh$2@clarknet.clark.net>, T.E.Dickey wrote: : : Michael Simms wrote: : : : : Does anyone have an idea as to how DEC terminals (VT 420, VT 340, etc) will : : support the new Euro Currency symbol. Such as which position in the : : character sets. Any information would be appreciated. : : without a hardware change, they won't (it's not an ISO-8859-1 character). It will have to be supported as a soft character set or by the addition of additional character sets such as ISO-8859-15 which do include the Euro. FYI, Kermit 95 1.1.17 will support all of the new ISO and IBM Code Page character sets, which include the "Euro". -- Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2 The Kermit Project * Columbia University 612 West 115th St #716 * New York, NY * 10025 http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Organization: Flashnet Communications, http://www.flash.net Sender: sturgeon@199.165.143.240 Message-ID: <35a101b0.1820019719@news.flash.net> Date: Mon, 06 Jul 1998 17:02:13 GMT From: JonS@futuresoft.com (Jon Stugeon) Subject: Euro currency symbol & dumb terminals/emulators All, I've recently been reading about support for the new Euro currency symbol in the Windows 95/98 & NT O/Ss. This got me thinking if there will be any kind of standard for how legacy host applications will represent the Euro symbol. 
Obviously if the final display device is a physical dumb terminal (eg VT-220) then it won't know anything about the Euro symbol, but if an emulator is being used then it could be configured to display the Euro symbol in place of an existing character. So, which character would be replaced? My guess is that this would be done on an ad-hoc, host-to-host basis, but I'd be glad to be put right. Regards, Jon Sturgeon JonS@futuresoft.com =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Organization: Columbia University Message-ID: <6nr10p$sf$1@apakabar.cc.columbia.edu> References: <35a101b0.1820019719@news.flash.net> NNTP-Posting-Host: watsun.cc.columbia.edu Date: 6 Jul 1998 17:20:25 GMT From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman) Subject: Re: Euro currency symbol & dumb terminals/emulators In article <35a101b0.1820019719@news.flash.net>, Jon Stugeon wrote: : All, : : I've recently been reading about support for the new Euro currency : symbol in the Windows 95/98 & NT O/Ss. This got me thinking if there : will be any kind of standard for how legacy host applications will : represent the Euro symbol. : : Obviously if the final display device is a physical dumb terminal (eg : VT-220) then it won't know anything about the Euro symbol, but if an : emulator is being used then it could be configured to display the Euro : symbol in place of an existing character. : : So, which character would be replaced? My guess is that this would be : done on an ad-hoc, host-to-host basis, but I'd be glad to be put : right. : : Regards, : Jon Sturgeon : JonS@futuresoft.com : Character-sets (including those with Euro support) are defined by the ISO as part of standard 8859. These are to be used by the host. IBM has defined new code pages for the inclusion of the Euro and Microsoft has added the Euro to its existing code pages. Emulators should not make up their own. 
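The point that the Euro lives at a position fixed by each published character set, rather than one chosen by the emulator, can be illustrated with a small sketch. It uses Python's standard codecs; the helper name euro_bytes is hypothetical, and cp858 is picked here only as one example of an IBM code page that carries the Euro (the post does not name a specific one).

```python
# Byte value of the Euro sign in several published character sets.
EURO = "\u20ac"  # U+20AC EURO SIGN


def euro_bytes(encoding: str):
    """Return the encoded Euro sign, or None if the set has no Euro."""
    try:
        return EURO.encode(encoding)
    except UnicodeEncodeError:
        return None


# latin-1 is ISO 8859-1; cp858 is an assumed example of an IBM Euro code page.
for name in ("latin-1", "iso8859-15", "cp1252", "cp858"):
    encoded = euro_bytes(name)
    print(name, encoded.hex() if encoded else "no Euro")
```

A host and an emulator that agree on the character-set name therefore agree on the byte, with no per-site remapping.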
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2 The Kermit Project * Columbia University 612 West 115th St #716 * New York, NY * 10025 http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman) Newsgroups: comp.terminals Subject: Re: Euro currency symbol & dumb terminals/emulators Date: 8 Jul 1998 20:04:28 GMT Organization: Columbia University Lines: 62 Message-ID: <6o0jcc$pqd$1@apakabar.cc.columbia.edu> References: <35a101b0.1820019719@news.flash.net> <35a28fdd.17881822@news.flash.net> <6nuogi$3n2$1@apakabar.cc.columbia.edu> <35a3a03e.87610587@news.flash.net> NNTP-Posting-Host: watsun.cc.columbia.edu In article <35a3a03e.87610587@news.flash.net>, Jon Stugeon wrote: : : So you're expecting users of host applications to use character : translation features built into terminal emulation software to choose : to map an arbitrary character in the *host* character set to the : appropriate character representing the Euro in the code page they are : using in their display font? Then if the host application needed to : display the Euro symbol *in addition* to an existing currency symbol : it would need to be modified to be aware of the configuration of the : user's emulation package? : : Surely if there was some kind of standard agreed upon for which : character the host applications will use to represent the Euro then we : wouldn't have another of those cases where the user doesn't get the : correct symbol just because his emulation software isn't configured : correctly. : : Or have I got hold of the wrong end of the stick here? The way that terminals (and emulation software) are supposed to work is that the host application instructs the terminal as to which character-set(s) should be loaded into the G0, G1, G2, and G3 character-set tables. 
These tables are then used to map a byte from the host to a particular character for display. If the local system does not support the character-sets used by the application, it must perform translation to a character-set that it does support. There are international standards for all of this. The ISO defined ISO 2022 more than 20 years ago to address the host-to-terminal assignment of character-sets and the mechanisms for switching between them. ISO 8859 defines the agreed-upon International character-sets. Part 15 declares the newly formed Western European character set which includes the Euro. IBM maintains the Code Page Registry. As such they introduced new code pages for both ASCII and EBCDIC systems that include the Euro for use in their operating systems (DOS and OS/2 on the PC; OS/400; OS/390, ...). Microsoft maintains its own Code Pages for Windows which are registered with IBM as Code Pages 1250-1258. These are based on the ISO 8859 character-sets but include printable characters in the C1 range. And then, of course, Unicode has defined a position for the Euro (0x20AC) in version 2.1 of that standard. ISO 2022 was used as the basis for the character-set handling in ANSI X3.64-1979 (since withdrawn), which was the basis for most Unix consoles and the DEC VT terminal line. It is also the basis of ISO-6429, the international standard that replaced ANSI X3.64-1979. Since FutureSoft is a manufacturer of terminal emulation software I would have expected you to know all this. How can Dynacomm emulate a VT terminal if it doesn't support this functionality? 
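The G0-G3 mechanism described in this post can be sketched in a few lines. This is a minimal illustration, not any emulator's actual code: the function name render is hypothetical, only 94-character designations (ESC followed by one of "(", ")", "*", "+" and a final byte) and the SO/SI locking shifts are handled, and the DEC Special Graphics table is deliberately partial.

```python
ESC, SO, SI = 0x1B, 0x0E, 0x0F

# Intermediate byte of a 94-character designation -> G-set slot.
SLOT = {ord("("): 0, ord(")"): 1, ord("*"): 2, ord("+"): 3}

# Partial DEC Special Graphics table (line-drawing subset only).
DEC_GRAPHICS = {0x6A: "┘", 0x6B: "┐", 0x6C: "┌", 0x6D: "└", 0x71: "─", 0x78: "│"}

# Final byte -> character-set table; an empty table means plain ASCII.
CHARSETS = {ord("B"): {}, ord("0"): DEC_GRAPHICS}


def render(data: bytes) -> str:
    """Decode a 7-bit host stream, honouring G0-G3 designations and SO/SI."""
    g = [CHARSETS[ord("B")]] * 4  # G0..G3 all start out as ASCII
    gl = 0                        # index of the set currently invoked into GL
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if (b == ESC and i + 2 < len(data)
                and data[i + 1] in SLOT and data[i + 2] in CHARSETS):
            g[SLOT[data[i + 1]]] = CHARSETS[data[i + 2]]  # designate a set
            i += 3
            continue
        if b == SO:
            gl = 1  # locking shift: invoke G1 into GL
        elif b == SI:
            gl = 0  # locking shift: invoke G0 into GL
        else:
            # Bytes without an entry in the active table pass through as-is
            # (this includes any escape sequence the sketch does not know).
            out.append(g[gl].get(b, chr(b)))
        i += 1
    return "".join(out)
```

For example, render(b"\x1b(0lqk\x1b(Bok") designates DEC Special Graphics into G0, draws three line-drawing characters, switches G0 back to ASCII and yields "┌─┐ok". The same host byte 0x71 is displayed as "q" or as a horizontal line depending only on which set the host has designated.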
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2 The Kermit Project * Columbia University 612 West 115th St #716 * New York, NY * 10025 http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Sender: sturgeon@199.165.143.240 Message-ID: <35a4fe63.111711372@news.flash.net> References: <35a101b0.1820019719@news.flash.net> <35a28fdd.17881822@news.flash.net> <6nuogi$3n2$1@apakabar.cc.columbia.edu> <35a3a03e.87610587@news.flash.net> <6o0jcc$pqd$1@apakabar.cc.columbia.edu> NNTP-Posting-Host: 199.165.143.240 Date: Wed, 08 Jul 1998 23:23:26 GMT From: JonS@futuresoft.com (Jon Stugeon) Subject: Re: Euro currency symbol & dumb terminals/emulators On 8 Jul 1998 20:04:28 GMT, jaltman@watsun.cc.columbia.edu (Jeffrey Altman) wrote: >Since FutureSoft is a manufacturer of terminal emulation software >I would have expected you to know all this. How can Dynacomm >emulate a VT terminal if it doesn't support this functionality? Thanks for the comprehensive reply, Jeffrey. DynaComm indeed emulates a VT terminal, including support for NRCs, DEC Supplemental/Graphics etc etc, but that does not necessarily mean that everybody that works for the manufacturer has the benefit of and understands the years of history behind character set development. Furthermore, not everybody that works for FutureSoft works in terminal emulation. I am trying to understand what, if any, modifications would be necessary to provide "support for the Euro"; that is the reason for my original enquiry. 
Regards, Jon Sturgeon JonS@futuresoft.com =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals Date: 9 Jul 1998 04:58:49 GMT Organization: Columbia University Message-ID: <6o1im9$cgj$1@apakabar.cc.columbia.edu> References: <35a101b0.1820019719@news.flash.net> <35a3a03e.87610587@news.flash.net> <6o0jcc$pqd$1@apakabar.cc.columbia.edu> <35a4fe63.111711372@news.flash.net> NNTP-Posting-Host: watsun.cc.columbia.edu From: jaltman@watsun.cc.columbia.edu (Jeffrey Altman) Subject: Re: Euro currency symbol & dumb terminals/emulators In article <35a4fe63.111711372@news.flash.net>, Jon Stugeon wrote: : : DynaComm indeed emulates a VT terminal, including support for NRCs, : DEC Supplemental/Graphics etc etc, but that does not necessarily mean that : everybody that works for the manufacturer has the benefit of and : understands the years of history behind character set development. : Furthermore, not everybody that works for FutureSoft works in terminal : emulation. : : I am trying to understand what, if any, modifications would be : necessary to provide "support for the Euro"; that is the reason for my : original enquiry. I apologize for assuming more knowledge than you have at your disposal. I assumed (obviously incorrectly) either that you were asking this question because you are somehow involved in your company's terminal emulation development, or that you had spoken to your own developers before posting this query on the Net. While I am a strong believer in the open sharing of knowledge, I must admit that I am a bit hesitant to provide a direct competitor with information that will help it take sales away from my product. On the other hand, I couldn't let someone come up with yet another hack solution (that I would end up needing to support for a customer in five years) because of ignorance. Hope this thread has been useful. 
-- Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2 The Kermit Project * Columbia University 612 West 115th St #716 * New York, NY * 10025 http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Newsgroups: comp.terminals References: <8aoi4p$uv1$1@nnrp1.deja.com> Message-ID: <8aooiv$2i5$1@newsmaster.cc.columbia.edu> Date: 15 Mar 2000 19:33:51 GMT Organization: Columbia University From: Jeffrey Altman Subject: Re: change from ascii to ansi character set in DOS window In article <8aoi4p$uv1$1@nnrp1.deja.com>, wrote: : Hi, : : I am running NT 4.0 SP3. My DOS window currently is displaying : the ASCII character set. However I want it to display the ANSI : character set. How do I do this? : The Console window is Unicode based. The font that is displayed is Unicode if you are using a TrueType font such as LucidaConsole or Code Page based (CP437, CP850, ...) if you are using raster fonts. The console application has a choice of writing to the screen using the active Code Page or Unicode. NT provides the proper translations. CP1252 is the Windows variation of ISO-Latin1 that you refer to as ANSI. To use this code page in your application, use SetConsoleCP() and SetConsoleOutputCP(). Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2 The Kermit Project * Columbia University 612 West 115th St #716 * New York, NY * 10025 http://www.kermit-project.org/k95.html * ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals References: <8kc2os$fkm$1@as102.tel.hr> Date: Mon, 10 Jul 2000 18:43:50 GMT Organization: @Home Network Newsgroups: comp.terminals Message-ID: From: dls2 Subject: Re: Setting keyboard over esc sequences on VT510/520 "IdrEASY" wrote: > Hi! > > I need to set Croatian keyboard over esc sequence (SCS=Croatian/Slovenian > latin). 
> Also I know that "ESC(&3" are sequence for Russian cyrilic. > > I wrote little program with double loop and generate esc calls, but only > have success for > Russian keyboard. > > Please, help me. > > Bye! The "(" represents the (94-character) G0 character set. The ")" represents the (94-character) G1 character set. The "*" represents the (94-character) G2 character set. The "+" represents the (94-character) G3 character set. The "-" represents the (96-character) G1 character set. The "." represents the (96-character) G2 character set. The "/" represents the (96-character) G3 character set. Russian NRCS is "&5", not "&3". SCS NRCS is "%3". So you should be using "ESC(%3". -- Derrick Shearer ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.terminals References: <8kc2os$fkm$1@as102.tel.hr> <8kho8g$2lr$1@as102.tel.hr> <8khpnp$im8$3@news1.Radix.Net> Date: Wed, 12 Jul 2000 14:04:44 +0100 Organization: RDEL Newsgroups: comp.terminals Message-ID: <396C6CEC.ABE87C5E@rdel.co.uk> From: Paul Williams Subject: Re: Setting keyboard over esc sequences on VT510/520 Thomas Dickey wrote: > > IdrEASY wrote: > >> > >> Russian NRCS is "&5", not "&3". > >> SCS NRCS is "%3". > > > Sorry, on my terminal (&4 is Russian, but (%3 does nothing. > > what type of terminal is that? > (I'm assuming vt220) It says VT510/520 on the subject line, Tom. (Yes, I hate it when vital information is only mentioned on the subject line!) ////////////////////////////////////////////////////////////////////////////// Message-ID: References: NNTP-Posting-Host: mail.pharmapartners.nl Newsgroups: comp.mail.pine Date: 6 Oct 2000 07:57:19 GMT From: Villy Kruse Subject: Re: Pine and French characters On Thu, 5 Oct 2000 13:29:02 -0400, Gopi Sundaram wrote: >On Thu, 5 Oct 2000, Samuel W. Heywood wrote: > >> If the character set used in Windows is not backward-compatible >> with DOS, then Windows does not adhere to the standard. 
> >I don't know what standards you are talking about, but I'm glad that >Windows finally used the ISO standard, whereas DOS didn't. > Well, actually Windows tries to "improve" on iso-8859-1 and calls that windows-1252. The difference is that some values in the range 0x80 to 0x9f have been assigned to characters which are missing in iso-8859-1. For example, the euro sign is 0x80 in win1252, but doesn't exist in iso-8859-1. However, it will be 0xA4 in iso-8859-15 aka latin-9. Check the alphabet soup at http://www.czyborra.com/ and see how standard the various standards really are. Villy ////////////////////////////////////////////////////////////////////////////// Newsgroups: comp.mail.pine Message-ID: References: Date: Mon, 9 Oct 2000 12:54:19 +0200 Organization: Knights of the Round Tuit From: "Alan J. Flavell" Subject: Re: Pine and French characters On Sat, 7 Oct 2000, Samuel W. Heywood wrote: > Thanks a lot for the URL. Now that I've read about QUOTED PRINTABLE I > understand that it probably would be best to load a code page for > handling this ISO-8859-1 character set. Excuse me but you're not quite with us yet. You would certainly be advised to load a code page that covers the Latin-1 repertoire; but the recommendation would be to load the cp850 code page, which covers this repertoire but is _not_ the iso-8859-1 character coding itself. PINE knows how to mediate between the two, as we've already covered in this thread. There is a relatively obscure code page, cp819, which represents the iso-8859-1 character coding. However, if you load it, you are going to find quite a number of conventional DOS applications displaying bizarre characters in their menus etc., instead of the DOS "box drawing" characters which they expected. You'll find some brief (and old) notes of mine here http://ppewww.ph.gla.ac.uk/~flavell/iso8859/iso8859-pointers.html#cp819 but I don't recommend that. 
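The distinction between sharing a repertoire and sharing a coding can be sketched with Python's standard codecs. This shows only the general kind of translation involved and is not PINE's actual code; the helper name latin1_to_cp850 is hypothetical.

```python
def latin1_to_cp850(data: bytes) -> bytes:
    """Re-encode ISO 8859-1 bytes as cp850 bytes.

    Raises UnicodeEncodeError for C1 control bytes (0x80-0x9F),
    which have no cp850 equivalent.
    """
    return data.decode("latin-1").encode("cp850")


# Same repertoire: every Latin-1 character from 0xA0 to 0xFF exists in cp850.
printable = bytes(range(0xA0, 0x100))
assert len(latin1_to_cp850(printable)) == len(printable)

# Different coding: e-acute is 0xE9 in iso-8859-1 but 0x82 in cp850.
print(latin1_to_cp850(b"\xe9").hex())

# Why a DOS menu looks odd under an iso-8859-1 code page: the byte a DOS
# program sends for a horizontal box line is a letter in iso-8859-1.
print(b"\xc4".decode("cp850"), b"\xc4".decode("latin-1"))
```

With cp850 loaded, DOS programs keep their box-drawing characters and the mail program translates incoming iso-8859-1 text; with cp819 loaded, no translation is needed but the box-drawing bytes turn into accented letters.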
Unless you have some special requirement that we haven't discussed here, I recommend that you use cp850. cheers Alan //////////////////////////////////////////////////////////////////////////////