I18N FAQ for Java Programming Language
Where can I find Java I18N FAQ ?
Where can I find good tutorial for Java i18n ?
After translating the properties file, the characters still look junk in the GUI. Why ?
What is the encoding of Java string ?
How will I find my default system native encoding ?
How can I print the Unicode values of my Java string in \uXXXX format (for DEBUG) ?
How do I know if my JRE/JDK is the international version ?
How do I convert the NCR format ( &#dddd; ) to a Java string ?
How do I set up my emulator for MIDlet Java applications ?
My AWT applet does not show Japanese content but works OK for French. What's wrong ?
How do I convert English messages (ASCII) to some other language (Unicode) ?
I am getting question marks from request.getParameter("TextField") in a servlet
Any good books on Java internationalization ?
After translating the properties file, the characters still look junk in the GUI. Why ?
After translating, convert the properties file to the \uXXXX escape format, because Java reads properties files as ISO 8859-1 and any character outside that range must be written as a Unicode escape. The native2ascii tool does this conversion. See http://java.sun.com/j2se/1.3/docs/tooldocs/solaris/native2ascii.html for more details.
Rendering issues such as missing fonts are another common cause of this problem.
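The effect of the escape format can be checked with a minimal sketch like the one below. The class name, the helper method, and the "greeting" key are made up for illustration; the value is the escaped form of two Japanese characters, fed to java.util.Properties the same way a file on disk would be.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class EscapedPropertiesDemo {
    // Hypothetical helper: parses properties text and returns the value for one key.
    static String readValue(String fileContent, String key) {
        Properties props = new Properties();
        try {
            // Properties.load(InputStream) decodes bytes as ISO-8859-1 and then
            // resolves Unicode escapes, which is why translated files are escaped.
            props.load(new ByteArrayInputStream(
                    fileContent.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The doubled backslashes produce a literal backslash in the file content.
        String fileContent = "greeting=\\u65E5\\u672C\n";
        String value = readValue(fileContent, "greeting");
        System.out.println("Characters loaded: " + value.length());
        System.out.println("First code unit: " + Integer.toHexString(value.charAt(0)));
    }
}
```

If the loaded string holds the expected code units but the GUI still shows junk, the problem is more likely on the font side.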
What is the encoding of Java string ?
A Java string stores its data as UTF-16 Unicode, that is, as a sequence of 16-bit code units. During input/output, this data is converted between UTF-16 and the system's native encoding by default. You can change this behavior by passing an encoding argument to OutputStreamWriter, InputStreamReader, and similar classes. See http://java.sun.com/docs/books/tutorial/i18n/text/stream.html for more detail.
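As a rough illustration of passing the encoding explicitly, the sketch below writes a string through an OutputStreamWriter and reads it back through an InputStreamReader. The class and method names are invented for this example, and in-memory streams stand in for real files or sockets.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

class ExplicitEncodingDemo {
    // Converts the UTF-16 string to bytes in the named encoding.
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, encoding)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Converts bytes in the named encoding back to a UTF-16 string.
    static String decode(byte[] bytes, String encoding) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), encoding)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00E9"; // four chars, the last one is outside ASCII
        System.out.println("UTF-8 bytes:      " + encode(text, "UTF-8").length);
        System.out.println("ISO-8859-1 bytes: " + encode(text, "ISO-8859-1").length);
        System.out.println("Round trip ok:    " + decode(encode(text, "UTF-8"), "UTF-8").equals(text));
    }
}
```

The same string produces a different number of bytes per encoding, which is why the reader must be told the encoding the writer used.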
How will I find my default system native encoding ?
Java determines the default encoding from the system, but users can override it with a command-line argument when starting the virtual machine. Use the following debug line to find the default encoding.
System.out.println(System.getProperty("file.encoding"));
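For comparison, here is a small sketch that asks the runtime itself which encoding it uses instead of reading a system property. Charset.defaultCharset() (Java 5 and later) and OutputStreamWriter.getEncoding() are standard APIs; choosing them here is an assumption of this example, not something taken from the FAQ, and the class name is made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

class DefaultEncodingDemo {
    // Canonical name of the charset the JVM uses by default.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    // Name of the encoding a writer picks when none is specified.
    // This may be a historical alias rather than the canonical name.
    static String writerEncoding() {
        return new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("file.encoding property: " + System.getProperty("file.encoding"));
        System.out.println("Charset.defaultCharset: " + defaultCharsetName());
        System.out.println("Writer default:         " + writerEncoding());
    }
}
```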
Note that "file.encoding" is an implementation-dependent property. Safer ways of determining the default encoding are described at http://java.sun.com/j2se/corejava/intl/reference/faqs/index.html#default-encoding
How do I know if my JRE/JDK is the international version ?
The easy way to find out is to look in the "lib" directory where you installed your JRE or JDK. If you see an "i18n.jar" file, you have the international version. If not, download the international version, since most Asian encodings are not supported by the English version of the JRE/JDK.
Note: In JRE 1.4 and later, lib/i18n.jar was replaced by lib/charsets.jar and lib/ext/localedata.jar.
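On Java 1.4 and later, a complementary check is to ask the runtime whether it supports a particular encoding, rather than looking for jar files. The sketch below assumes java.nio.charset.Charset is available; the class name and the list of encodings are only examples.

```java
import java.nio.charset.Charset;

class CharsetSupportCheck {
    // Returns true if this runtime can convert to and from the named encoding.
    static boolean isSupported(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or syntactically illegal charset names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example Asian encodings; adjust the list to the ones you need.
        String[] names = {"Shift_JIS", "Big5", "EUC-KR"};
        for (String name : names) {
            System.out.println(name + " supported: " + isSupported(name));
        }
    }
}
```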
How do I set up my emulator for MIDlet Java applications ?
See this article on developing multilingual wireless Java applications: http://wireless.java.sun.com/midp/ttips/customize/
My AWT applet does not show Japanese content but works OK for French. What's wrong ?
There are probably two reasons for this:
1) You do not have the international version of the JRE. The easy way to check is to look for an i18n.jar file in the "lib" directory where the JRE is installed. If you don't see one, download the international version of the JRE.
2) The second possibility is that you are viewing the applet on an English machine. Even though IE automatically downloads the Japanese fonts, that only lets the Japanese characters on the HTML page display properly. The Java applet does not recognize these downloaded fonts.
How do I convert English messages (ASCII) to some other language (Unicode) ?
There are two steps involved here.
Step 1) Translate the English text into the other language (such as Spanish). This is usually done by human translators. If you don't have access to a translator and just want an approximate translation, you can use the AltaVista web site http://world.altavista.com/ or Google.
Step 2) Convert the translated text to the Unicode escape format. To do that, use the native2ascii tool from the JDK (Java Development Kit). You need to specify the encoding of the translated text from step 1.
For example, if you use AltaVista and click "Save As" on the result page, the data is saved in UTF-8 format. (See the encoding menu in the "Save As" dialog in IE.)
Now do the conversion using native2ascii, e.g. native2ascii -encoding UTF8
Then open the file in an editor (such as Notepad) and remove all the HTML text that you don't want.
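To make the conversion in step 2 concrete, the following is a rough sketch of the kind of escaping native2ascii performs: text is decoded from UTF-8 and every character outside 7-bit ASCII is rewritten as a \uXXXX escape. It is not the tool's actual implementation, the class and method names are hypothetical, and details such as hex digit case may differ from the tool's output.

```java
import java.nio.charset.StandardCharsets;

class UnicodeEscaper {
    // Keeps ASCII as-is and writes every other UTF-16 code unit as a
    // backslash-u escape with four hex digits.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    // Decodes UTF-8 bytes (the encoding of the saved translation) and escapes them.
    static String escapeUtf8(byte[] utf8) {
        return escape(new String(utf8, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // A Spanish word with one non-ASCII letter, written with a source escape.
        System.out.println(escape("ma\u00F1ana"));
    }
}
```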
I am getting question marks from request.getParameter("TextField") in a servlet
It looks like the servlet is not able to recognize the encoding of the posted data. One workaround is to get the raw bytes back out of the Java String and construct a new String from them, explicitly passing the Big5 encoding. The code looks something like this:
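// Recover the raw posted bytes: ISO-8859-1 maps each char back to one byte.
byte[] hibytes = request.getParameter("TextField").getBytes("ISO-8859-1");
// Decode those bytes with the encoding the client actually used.
String param = new String(hibytes, "Big5");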
The getBytes("ISO-8859-1") call gives you the raw bytes. The use of "ISO-8859-1" here is just a hack to get the low 8 bits of each Java character. After this call, your byte array holds the multi-byte data the user posted, and creating a Java string from it with the "Big5" encoding yields a string with the appropriate Unicode values.
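The round trip can be tried outside a servlet container with a sketch like the one below. It simulates the container's mistake by decoding Big5 bytes as ISO-8859-1 and then applies the same recovery step. The class and method names are invented for the example, and it assumes the runtime supports the Big5 charset.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Big5RoundTrip {
    // Undoes a wrong ISO-8859-1 decode: each char becomes one byte again,
    // and the bytes are decoded with the encoding the client really used.
    static String redecode(String misdecoded, String actualEncoding) {
        try {
            byte[] raw = misdecoded.getBytes("ISO-8859-1");
            return new String(raw, actualEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4E2D\u6587"; // two Chinese characters
        // Simulate the container: Big5 bytes wrongly decoded as ISO-8859-1.
        byte[] posted = original.getBytes(Charset.forName("Big5"));
        String misdecoded = new String(posted, StandardCharsets.ISO_8859_1);

        System.out.println("Misdecoded length: " + misdecoded.length());
        System.out.println("Recovered equals original: "
                + redecode(misdecoded, "Big5").equals(original));
    }
}
```

The trick works only because ISO-8859-1 maps every byte value to exactly one char, so no information is lost in the wrong decode.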
By the way, the above suggestion assumes that getParameter() returns the multi-byte string without recognizing its encoding, storing each byte in one Java Unicode char. If you are already getting question marks at this point, the above workaround may not have any effect. Use the code from http://www.i18nfaq.com/java.html#6 to dump the Unicode values and see what they are. If it shows \u003F\u003F, then your "pocket IE" itself is converting the characters to question marks before posting. We may have to research how to make "pocket IE" recognize Big5.
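In case the linked code is not at hand, a minimal stand-in for such a dump might look like this. It is a sketch written for this answer, not the code from the linked page, and the class and method names are hypothetical.

```java
class UnicodeDump {
    // Renders every UTF-16 code unit of the string as a backslash-u escape.
    static String dump(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            out.append(String.format("\\u%04X", (int) text.charAt(i)));
        }
        return out.toString();
    }

    // True when the string is non-empty and holds nothing but question marks,
    // which suggests the data was already lost before it reached the servlet.
    static boolean isAllQuestionMarks(String text) {
        return !text.isEmpty() && text.chars().allMatch(c -> c == '?');
    }

    public static void main(String[] args) {
        String received = "??"; // stand-in for the value of request.getParameter
        System.out.println(dump(received));
        System.out.println("Already lost: " + isAllQuestionMarks(received));
    }
}
```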
Note: The workaround shown above is no longer needed. Instead, use the ServletRequest.setCharacterEncoding method, which was introduced in Servlet 2.3.