java – Setting Turkish and English locale: translate Turkish characters to Latin equivalents

Exception or error:

I want to convert my Turkish strings to lowercase using both the English and the Turkish locale. I’m doing this:

String myString="YAŞAR BAYRI";
Locale trlocale= new Locale("tr-TR");
Locale enLocale = new Locale("en_US");

Log.v("mainlist", "en source: " +myString.toLowerCase(enLocale));
Log.v("mainlist", "tr source: " +myString.toLowerCase(trlocale));

The output is:

en source: yaşar bayri

tr source: yaşar bayri

But I want to have an output like this:

en source: yasar bayri

tr source: yaşar bayrı

Is this possible in Java?

How to solve:

If you are using the Locale constructor, you can and must set the language, country and variant as separate arguments:

new Locale(language)
new Locale(language, country)
new Locale(language, country, variant)

Therefore, your test program creates locales whose language is the whole string “tr-TR” or “en_US”. Neither matches a known language code, so no Turkish casing rules are applied. For your test program, you can use new Locale("tr", "TR") and new Locale("en", "US").

If you are using Java 7 or later, you can also parse a language tag using Locale.forLanguageTag. Note that language tags separate their parts with a hyphen, not an underscore:

String myString="YAŞAR BAYRI";
Locale trlocale= Locale.forLanguageTag("tr-TR");
Locale enLocale = Locale.forLanguageTag("en-US");

With these locales, toLowerCase produces the lower-case form appropriate to each language.
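To make the difference concrete, here is a minimal sketch of the question's test rewritten with well-formed language tags. The class and method names (LocaleLowerCaseDemo, lowerEnglish, lowerTurkish) are made up for illustration, and System.out stands in for the Android Log.v call from the question.

```java
import java.util.Locale;

class LocaleLowerCaseDemo {
    // Unicode escapes keep this sketch independent of the source file encoding.
    static final String INPUT = "YA\u015EAR BAYRI"; // S with cedilla is U+015E

    static String lowerEnglish(String s) {
        // Hyphen-separated tag; "en_US" is not a well-formed language tag.
        return s.toLowerCase(Locale.forLanguageTag("en-US"));
    }

    static String lowerTurkish(String s) {
        // With a real Turkish locale, capital I lowercases to dotless i (U+0131).
        return s.toLowerCase(Locale.forLanguageTag("tr-TR"));
    }

    public static void main(String[] args) {
        System.out.println("en source: " + lowerEnglish(INPUT));
        System.out.println("tr source: " + lowerTurkish(INPUT));
    }
}
```

With these locales the English result is yaşar bayri and the Turkish result is yaşar bayrı. Neither locale turns ş into s; that needs a separate step.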


I think this is the problem:

Locale trlocale= new Locale("tr-TR");

Try this instead:

Locale trlocale= new Locale("tr", "TR");

That’s the constructor to use to specify country and language.


You can do this:

Locale trlocale= new Locale("tr","TR");

The first parameter is your language, while the other one is your country.


If you just want the string in ASCII, without accents, the following might do.
First, each accented character is decomposed into its ASCII base character plus a combining diacritical mark (a zero-width accent). Then only those marks are removed with a regular-expression replace.

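// Requires: import java.text.Normalizer;
public static String withoutDiacritics(String s) {
    // Decompose any ş into s and a combining cedilla.
    String s2 = Normalizer.normalize(s, Normalizer.Form.NFD);
    // Remove the combining marks. Dotless ı has no decomposition, so it is left unchanged.
    return s2.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}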
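As a rough usage sketch, the normalizer approach can be combined with a non-Turkish lowercase to get the "en source" output the question asks for. The class name DiacriticsDemo and the helper lowerWithoutDiacritics are made up for illustration. Using Locale.ENGLISH for the lowercase step is my assumption about what fits the question.

```java
import java.text.Normalizer;
import java.util.Locale;

class DiacriticsDemo {
    static String withoutDiacritics(String s) {
        // NFD splits s-cedilla into a plain s followed by a combining cedilla (U+0327).
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // Drop every character in the Combining Diacritical Marks block.
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    static String lowerWithoutDiacritics(String s) {
        // English lowercasing maps capital I to a dotted i, so no dotless i
        // (U+0131) is produced here. A dotless i already present in the input
        // would survive, because it has no canonical decomposition.
        return withoutDiacritics(s.toLowerCase(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        // Unicode escape for S with cedilla keeps the source encoding-independent.
        System.out.println(lowerWithoutDiacritics("YA\u015EAR BAYRI"));
    }
}
```

For the question's input this prints yasar bayri. If the input can already contain ı, it needs an explicit replacement, because normalization alone will not turn it into i.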


The characters ş and s are different characters. Changing the locale cannot translate one into the other. You have to create a Turkish-to-Latin character table and do the replacement yourself. I once did this for Vietnamese, which has a lot of such characters. Turkish only has a handful of them (ç, ğ, ı, ö, ş, ü and their capitals), so the table is short. Good luck!
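A hand-written table along these lines could look like the sketch below. The class name TurkishToAscii and the method toAscii are hypothetical, and the choice of replacement for each letter (for example ğ to g) is my assumption about what "Latin equivalent" should mean here.

```java
class TurkishToAscii {
    // Parallel strings: the character at index i in TURKISH maps to index i in ASCII.
    // Unicode escapes are used so the sketch does not depend on the file encoding.
    // Order: c-cedilla, g-breve, dotless i, o-umlaut, s-cedilla, u-umlaut,
    // then the six capital forms (including capital I with dot above, U+0130).
    private static final String TURKISH =
            "\u00E7\u011F\u0131\u00F6\u015F\u00FC\u00C7\u011E\u0130\u00D6\u015E\u00DC";
    private static final String ASCII = "cgiosuCGIOSU";

    static String toAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int idx = TURKISH.indexOf(c);
            // Replace table characters and copy everything else unchanged.
            sb.append(idx >= 0 ? ASCII.charAt(idx) : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("ya\u015Far bayr\u0131"));
    }
}
```

Unlike Unicode normalization, the table also handles dotless ı, and it leaves all other characters alone.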
