Introduction to I18N for QA Engineers

Introduction to software internationalization for QA Engineers


What is I18N/CodePage/locale etc ?
What is internationalization QA ?
What is localization QA ?
I know only English. Can I still do internationalization QA ?
Any easy way to break the software in Asian languages ?
What are the main areas of focus for testing a software product for i18n ?

What is I18N/CodePage/locale etc ?

Refer to questions 1-8 in the developer section.

What is internationalization QA ?

QA with a specific focus on testing the product for language compatibility. This includes testing how the product identifies its language environment, initializes itself from that environment, and adapts to it.

White box testing typically includes checking the code against i18n compatibility standards (e.g. use of the correct APIs). Black box testing typically includes running the full functional regression suite in different language environments and exercising the interface with native-language strings. Culture-specific information (such as date/time display) needs to be checked as well.
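As a small illustration of what "using the correct API" can mean in a white box review, here is a minimal sketch in standard C. It assumes a C program that is expected to pick up its locale from the environment; the helper names init_locale_from_environment and format_local_date are hypothetical and not taken from any particular product. The point to look for is that the code calls setlocale and formats dates with the locale-dependent %x conversion rather than a hard-coded pattern such as "%m/%d/%Y".

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: adopt the locale named by the environment
   (LANG, LC_ALL, ...). If that locale is not installed, fall back
   to the "C" locale. Returns the name of the locale now in effect. */
const char *init_locale_from_environment(void) {
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL) {
        name = setlocale(LC_ALL, "C");
    }
    return name;
}

/* Hypothetical helper: format t as a date using the current LC_TIME
   locale. "%x" means "the locale's preferred date representation".
   Returns the number of bytes written, or 0 on failure. */
size_t format_local_date(char *buf, size_t size, time_t t) {
    assert(buf != NULL && size > 0);
    struct tm *tm = localtime(&t);
    if (tm == NULL) {
        return 0;
    }
    return strftime(buf, size, "%x", tm);
}
```

A caller would run init_locale_from_environment() once at startup and then use format_local_date for every date it displays. The exact output depends on which locales are installed on the test machine, so a tester would compare the displayed date against the convention of the locale under test.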

What is localization QA ?

Localization QA is done after the final localization/translation of the software. The emphasis is less on functionality and more on checking the appropriateness of the translation in its GUI context. It also involves checking the GUI layout and making sure nothing is truncated. It is typically done by a native speaker of that specific language.

I know only English. Can I still do internationalization QA ?

Absolutely. The i18n part of QA requires no knowledge of the specific language, even though knowing the language will come in handy, especially while setting up the language environment. When you do i18n QA your product is not translated yet, so its interface will still display in English. Some error messages propagated from the environment might confuse you, but usually the error numbers will help you look them up in the English error message repository.

Any easy way to break the software in Asian languages ?

Many of the Asian code pages have multi-byte characters whose second byte falls in the
ASCII range. These characters are known to break products.
Here is one scenario...

char *str = "ABCD\x95\x5C" "EF\\XYZ" ;   /* Shift-JIS bytes; 0x95 0x5C is the single character 表 */
Byte values...
A     B     C     D     表         E     F     \     X     Y     Z
0x41  0x42  0x43  0x44  0x95 0x5C  0x45  0x46  0x5C  0x58  0x59  0x5A

Notice the two bytes with the value 0x5C. One is the second byte of the Japanese character and the other is the code point of the backslash.
Say you want to split this string at the backslash. Code that is not i18n compatible, like the following...


char *ptr = str ;
while(*ptr != '\0' && *ptr++ != '\\') { ...

will stop at the second byte of the Japanese character, and any string truncation at that point will corrupt your string. So if your test data uses any of the troublesome characters (use link below), you have a better chance of breaking the system.
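To make the scenario above concrete, here is a minimal sketch in C that contrasts a byte-wise search with one that skips over double-byte characters. It assumes the data is Shift-JIS and that lead bytes fall in the ranges 0x81-0x9F and 0xE0-0xFC; the function names are hypothetical. Production code would normally rely on the platform's multi-byte API rather than a hand-written range check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: in Shift-JIS, a byte in 0x81-0x9F or 0xE0-0xFC is the
   first byte of a two-byte character. */
static int is_sjis_lead_byte(unsigned char c) {
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Byte-wise search: returns the first 0x5C byte, even if it is the
   second byte of a double-byte character. */
const char *find_backslash_naive(const char *s) {
    assert(s != NULL);
    return strchr(s, '\\');
}

/* Multi-byte aware search: steps over both bytes of a double-byte
   character, so only a stand-alone 0x5C is reported.
   Returns NULL if there is no real backslash. */
const char *find_backslash_sjis(const char *s) {
    assert(s != NULL);
    while (*s != '\0') {
        if (is_sjis_lead_byte((unsigned char)*s) && s[1] != '\0') {
            s += 2;            /* skip lead byte and trail byte together */
        } else if (*s == '\\') {
            return s;          /* a genuine backslash */
        } else {
            s++;
        }
    }
    return NULL;
}
```

With the example string from this answer, the byte-wise search reports offset 5 (the trail byte of the Japanese character), while the aware search reports offset 8 (the real backslash). As a tester you cannot see which version the product uses, which is why feeding it such characters is an effective black box check.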

What are the main areas of focus for testing a software product for i18n ?

1) Messages: All text and images should be externalized.
Testing can combine code scanning tools that look for
hard-coded text with a pseudo translation of the whole
product, followed by a manual walk through each GUI
screen to check whether any English is still visible.

2) Date/Time format: After you change the locale of
the software, the date/time format should change
according to the locale. Identify every date/time
display, change the locale and test them one by one.

3) Charset/Encoding: Look for communication layers in
the product and run specific tests that pass non-English
data through each channel, checking for garbage on the
other side. Automated scripts can be used for these tests.

4) GUI Layout: The most common problem is text
truncation. Pick a language like German, which is known
for long text, do a pseudo translation (use the Google
translation tool or Babel Fish) and go through the
GUI to check for any truncation. At the code level you
can also check that all string buffers are big enough
and that the UI space of each component is about 30%
larger than the English text needs, to accommodate
long text.

5) Sorting: Make sure the sorting of any list follows
the collation rules of the locale.

6) Other locale-sensitive formatting, depending on the
product: currency, numbers, addresses etc.
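The pseudo translation mentioned in items 1 and 4 does not need a real translation tool. Below is a minimal sketch in C, under the assumption that messages are plain single-byte strings; the function name pseudo_translate, the bracket markers and the '~' padding are all illustrative choices, not a standard. It wraps each message in brackets, so that text missing the brackets is easy to spot as hard-coded, and pads it by roughly 30% so that truncation shows up in the GUI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "[" + src + padding + "]" into dst.
   The padding is '~' repeated ceil(0.3 * strlen(src)) times.
   Returns the length written (excluding the terminating NUL),
   or 0 if dst is too small to hold the result. */
size_t pseudo_translate(const char *src, char *dst, size_t dst_size) {
    assert(src != NULL && dst != NULL);
    size_t len = strlen(src);
    size_t pad = (len * 3 + 9) / 10;      /* 30% of len, rounded up */
    size_t total = len + pad + 2;         /* text + padding + brackets */
    if (total + 1 > dst_size) {
        return 0;                         /* no room for the NUL: refuse */
    }
    dst[0] = '[';
    memcpy(dst + 1, src, len);
    memset(dst + 1 + len, '~', pad);
    dst[1 + len + pad] = ']';
    dst[total] = '\0';
    return total;
}
```

For example, the message "Save" becomes "[Save~~]". If a screen shows "Save" without brackets, the string was not externalized; if it shows "[Save~" with the closing bracket cut off, the control or buffer is too small.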