net.sourceforge.wohenchan.encoding
Class GB2312Encoding

java.lang.Object
  |
  +--net.sourceforge.wohenchan.encoding.GB2312Encoding
All Implemented Interfaces:
EncodingInfoInterface

public class GB2312Encoding
extends java.lang.Object
implements EncodingInfoInterface

Information about the GB2312-1980 encoding. The following is taken from ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf

 2.2.2: GB 2312-80
 
         This basic (simplified) Chinese character set standard
 enumerates 7,445 characters, 6,763 of which are hanzi separated into
 two levels. Hanzi in the first level are arranged by reading, and
 those in the second level are arranged by radical then total number of
 (remaining) strokes. GB 2312-80 is also known as the "Primary Set,"
 GB0 (zero), or just GB.
 
 o Row 1: 94 symbols
 o Row 2: 72 numerals
 o Row 3: 94 full-width GB 1988-89 characters (see Section 2.2.1)
 o Row 4: 83 hiragana
 o Row 5: 86 katakana
 o Row 6: 48 uppercase and lowercase Greek alphabet
 o Row 7: 66 uppercase and lowercase Cyrillic (Russian) alphabet
 o Row 8: 26 Pinyin and 37 Bopomofo characters
 o Row 9: 76 line-drawing elements (09-04 through 09-79)
 o Rows 16 through 55: 3,755 hanzi (Level 1 Hanzi; last is 55-89)
 o Rows 56 through 87: 3,008 hanzi (Level 2 Hanzi; last is 87-94)
 
 Compare some of the structure with JIS X 0208-1990, and you will find
 many similarities, such as:
 
 o Hiragana, katakana, Greek, and Cyrillic characters are in Rows 4, 5,
   6, and 7, respectively
 o Chinese characters begin at Row 16
 o Chinese characters are separated into two levels
 o Level 1 arranged by reading
 o Level 2 arranged by radical then total number of strokes
 
 The Japanese standard, JIS C 6226-1978, came out in 1978, which means
 that it pre-dates GB 2312-80. The above similarities could not be by
 coincidence, but rather by design.
         Appendix G (pp 318-344) of "Developing International Software
 for Windows 95 and Windows NT" by Nadine Kano illustrates the GB 2312-
 80 character set standard by EUC code (Microsoft calls this Code Page
 936). Code Page 936 incorporates the correction of the hanzi at 79-81,
 and the correction of the order of 07-22 and 07-23 (see Section 2.2.3
 for more details).
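
The row layout and the EUC form mentioned above can be illustrated with a short Java sketch. It is not part of this package: the class Gb2312Layout and its methods are hypothetical names chosen for the example. It assumes the usual EUC mapping for GB 2312, in which each of the two bytes is the row or cell number plus 0xA0, and it takes the Level 1 and Level 2 boundaries (55-89 and 87-94) from the list above.

```java
// Hypothetical helper, not part of net.sourceforge.wohenchan.
final class Gb2312Layout {

    /** Returns 1 or 2 for Level 1 / Level 2 hanzi, or 0 for any other position. */
    public static int hanziLevel(int row, int cell) {
        if (cell < 1 || cell > 94) {
            return 0;
        }
        if (row >= 16 && row <= 55) {
            // Level 1 ends at 55-89, so cells 90 through 94 of row 55 are excluded.
            return (row == 55 && cell > 89) ? 0 : 1;
        }
        if (row >= 56 && row <= 87) {
            return 2; // Level 2 runs through 87-94
        }
        return 0;
    }

    /** Converts a row-cell pair to its two-byte EUC form (assumed mapping: value + 0xA0). */
    public static byte[] toEuc(int row, int cell) {
        if (row < 1 || row > 94 || cell < 1 || cell > 94) {
            throw new IllegalArgumentException("row and cell must be in 1..94");
        }
        return new byte[] {(byte) (row + 0xA0), (byte) (cell + 0xA0)};
    }

    /** Counts the row-cell positions that fall in the given hanzi level. */
    public static int countLevel(int level) {
        int count = 0;
        for (int row = 1; row <= 94; row++) {
            for (int cell = 1; cell <= 94; cell++) {
                if (hanziLevel(row, cell) == level) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Level 1 hanzi: " + countLevel(1)); // 3755
        System.out.println("Level 2 hanzi: " + countLevel(2)); // 3008
        byte[] first = toEuc(16, 1); // first Level 1 position, 16-01
        System.out.printf("%02X %02X%n", first[0] & 0xFF, first[1] & 0xFF); // B0 A1
    }
}
```

Counting positions this way reproduces the 3,755 and 3,008 totals quoted in the list, which is a quick check that the row boundaries were read correctly.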
 

Version:
$Name: $ $Date: 2003/06/22 21:51:37 $
Author:
$Author: wtanaka $

Field Summary
static java.lang.String CANONICAL
           
(package private) static GB2312Encoding s_singleton
           
 
Fields inherited from interface net.sourceforge.wohenchan.encoding.EncodingInfoInterface
ALL_ENCODINGS
 
Constructor Summary
private GB2312Encoding()
           
 
Method Summary
 java.lang.String getCanonicalString()
           
static GB2312Encoding getInstance()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CANONICAL

public static final java.lang.String CANONICAL
See Also:
Constant Field Values

s_singleton

static final GB2312Encoding s_singleton

Constructor Detail

GB2312Encoding

private GB2312Encoding()

Method Detail

getInstance

public static GB2312Encoding getInstance()

getCanonicalString

public java.lang.String getCanonicalString()
Specified by:
getCanonicalString in interface EncodingInfoInterface
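
The members listed above (a private constructor, a static final s_singleton field, and a static getInstance() method) suggest a singleton. The following is a minimal sketch of that pattern using stand-in types, not the actual source of this class. EncodingInfo and SketchEncoding are hypothetical names, the CANONICAL value is a placeholder because the real constant is not shown on this page, and the bodies of getInstance() and getCanonicalString() are assumptions.

```java
// Hypothetical stand-in for EncodingInfoInterface; only the one method shown on this page.
interface EncodingInfo {
    String getCanonicalString();
}

// Hypothetical stand-in for GB2312Encoding.
final class SketchEncoding implements EncodingInfo {

    // Placeholder value: the real CANONICAL constant is not given on this page.
    public static final String CANONICAL = "placeholder-canonical-name";

    // Single eagerly created instance, mirroring the package-private s_singleton field.
    static final SketchEncoding s_singleton = new SketchEncoding();

    // Private constructor prevents other classes from creating instances.
    private SketchEncoding() {
    }

    // Assumption: getInstance() simply returns the shared instance.
    public static SketchEncoding getInstance() {
        return s_singleton;
    }

    // Assumption: the canonical string is the CANONICAL constant.
    @Override
    public String getCanonicalString() {
        return CANONICAL;
    }
}

final class SingletonDemo {
    public static void main(String[] args) {
        EncodingInfo a = SketchEncoding.getInstance();
        EncodingInfo b = SketchEncoding.getInstance();
        System.out.println(a == b); // true: both calls return the same object
        System.out.println(a.getCanonicalString());
    }
}
```

Callers would obtain the encoding through getInstance() rather than a constructor, so every caller shares one immutable object.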