net.sourceforge.wohenchan.encoding
Class EncodingGuesser
java.lang.Object
|
+--net.sourceforge.wohenchan.encoding.EncodingGuesser
- public class EncodingGuesser
- extends java.lang.Object
- Version:
- $Name: $ $Date: 2003/06/22 17:40:26 $
- Author:
- $Author: wtanaka $
Field Summary
private java.lang.String ENUS
private java.lang.String PINYIN
private java.lang.String[] pinyinArray
private java.lang.String UPLUS
private java.lang.String ZHCN
private java.lang.String ZHTW
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ENUS
private final java.lang.String ENUS
- See Also:
- Constant Field Values
ZHCN
private final java.lang.String ZHCN
- See Also:
- Constant Field Values
ZHTW
private final java.lang.String ZHTW
- See Also:
- Constant Field Values
PINYIN
private final java.lang.String PINYIN
- See Also:
- Constant Field Values
UPLUS
private final java.lang.String UPLUS
- See Also:
- Constant Field Values
pinyinArray
private final java.lang.String[] pinyinArray
EncodingGuesser
public EncodingGuesser()
addRecognizedEncoding
public void addRecognizedEncoding(EncodingInfoInterface encodingInfo)
- Adds the given encodingInfo as a recognized encoding for this
Guesser.
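A minimal sketch of what registering recognized encodings could look like. The methods of EncodingInfoInterface are not shown on this page, so `SketchEncodingInfo`, its `name()` and `likelihood()` methods, and `SketchGuesserRegistry` are hypothetical stand-ins for illustration only, not the actual wohenchan API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for EncodingInfoInterface; the real
// interface's methods are not documented on this page.
interface SketchEncodingInfo {
    /** Name of the encoding, e.g. "UTF-8". */
    String name();

    /** Assumed scoring hook: higher means the input looks more like this encoding. */
    int likelihood(byte[] input);
}

// Hypothetical registry that mirrors the shape of addRecognizedEncoding.
class SketchGuesserRegistry {
    private final List<SketchEncodingInfo> recognized = new ArrayList<>();

    /** Adds the given encoding description to the set this guesser considers. */
    void addRecognizedEncoding(SketchEncodingInfo encodingInfo) {
        if (encodingInfo == null) {
            throw new IllegalArgumentException("encodingInfo must not be null");
        }
        recognized.add(encodingInfo);
    }

    /** Number of encodings registered so far. */
    int recognizedCount() {
        return recognized.size();
    }
}
```

Under these assumptions, a caller would register one object per encoding before asking the guesser for a result.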
guessEncodings
public java.lang.String[] guessEncodings(byte[] input)
throws NoLikelyEncodingException
- Returns the likely encodings for the given byte array. It
might be a good idea to optimize for the case where the byte
array passed in has, as a prefix, the byte array passed into the
last call to this method.
Should this return an EncodingInfoInterface[] instead?
- Returns:
- A non-empty array (length > 0). For now it contains only the
name of the single most likely encoding; it could be changed to
return all likely encodings.
- Throws:
- NoLikelyEncodingException - if there are no likely
encodings for the given input.
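The contract above (a non-empty array of encoding names, or an exception when nothing fits) can be sketched as follows. This is not the class's actual algorithm: the private gbProb, utf8Prob and pinyinProb methods return integer scores whose computation is not documented here, so the sketch swaps in strict charset decoding from `java.nio.charset` as the likelihood test. `GuessSketch`, `NoLikelySketchException`, and the `candidates` parameter are hypothetical names introduced for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.ArrayList;
import java.util.List;

class GuessSketch {
    /** Hypothetical stand-in for NoLikelyEncodingException. */
    static class NoLikelySketchException extends Exception {
        NoLikelySketchException(String message) {
            super(message);
        }
    }

    /** Returns true if the bytes decode in the given charset without any error. */
    static boolean decodesCleanly(byte[] input, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Returns the names of all candidate charsets that decode the input
     * cleanly, in candidate order. Assumption: "likely" means "decodes
     * without error", which is simpler than a real probability score.
     */
    static String[] guessEncodings(byte[] input, List<Charset> candidates)
            throws NoLikelySketchException {
        List<String> likely = new ArrayList<>();
        for (Charset charset : candidates) {
            if (decodesCleanly(input, charset)) {
                likely.add(charset.name());
            }
        }
        if (likely.isEmpty()) {
            throw new NoLikelySketchException("no likely encoding for the given input");
        }
        return likely.toArray(new String[0]);
    }
}
```

Returning every clean candidate rather than only the best one matches the "return all the encodings" option mentioned in the Returns note; a caller that wants a single answer can take the first element.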
gbProb
private int gbProb(byte[] inputString)
utf8Prob
private int utf8Prob(byte[] inputString)
pinyinProb
private int pinyinProb(byte[] inputString)
lookForPinyin
private boolean lookForPinyin(java.lang.String in)
main
public static void main(java.lang.String[] args)