toLowerCase() in Java does not give the expected result when used with a Locale

Take a look at the following code snippet in Java.

    import java.util.Locale;

    final public class Main {
        public static void main(String[] args) {
            Locale.setDefault(new Locale("lt")); // setting Lithuanian as locale
            String str = "\u00cc";
            System.out.println("Before case conversion is " + str + " and length is " + str.length()); // Ì
            String lowerCaseStr = str.toLowerCase();
            System.out.println("Lower case is " + lowerCaseStr + " and length is " + lowerCaseStr.length()); // i̇̀
        }
    }

It displays the following output.

    Before case conversion is Ì and length is 1
    Lower case is i̇̀ and length is 3


The output of the first System.out.println() call is what I expect. The second call, however, reports a length of 3, where I expected 1. Why does this happen?

3 answers

Different languages have different rules for converting to upper or lower case.

For example, in German, the lowercase ß becomes two uppercase S characters, so the 6-character word "straße" (street) becomes "STRASSE", which is 7 characters long.

This is why your upper-case and lower-case strings have different lengths.
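The ß case is easy to check. Below is a minimal sketch; the class name SharpSDemo and the helper upperGerman are illustrative only, and the word is written with a \u00df escape so that the result does not depend on the source file encoding.

```java
import java.util.Locale;

public class SharpSDemo {
    // Hypothetical helper: upper-cases a string using German rules.
    static String upperGerman(String s) {
        return s.toUpperCase(Locale.GERMAN);
    }

    public static void main(String[] args) {
        String word = "stra\u00dfe"; // "straße", 6 chars
        String upper = upperGerman(word);
        // The single char ß is replaced by the two chars "SS".
        System.out.println(upper + " " + word.length() + " -> " + upper.length());
        // prints: STRASSE 6 -> 7
    }
}
```

The same kind of one-to-many mapping is what turns the single Ì into three chars under the Lithuanian locale.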

I wrote about this in one of my Java quizzes: http://thecodersbreakfast.net/index.php?post/2010/09/24/Java-Quiz-42-%3A-A-string-too-far


I get a different result:

    Before case conversion is Ì and length is 1
    Lower case is i?? and length is 3

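This is pretty much a duplicate of "Does Java toLowerCase() keep the original string length?". That question is very helpful and has a very detailed answer. The lengths of str and str.toLowerCase() are not always the same, because the conversion depends on the code point of each char and on the locale.

In this case, the second output is "Lower case is i?? and length is 3". The i is followed by two ? marks, so the length is 3. Each ? stands for a combining character that the console could not display: under the Lithuanian rules, Ì (U+00CC) lower-cases to i followed by a combining dot above (U+0307) and a combining grave accent (U+0300).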
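To see what the three chars actually are, it helps to print them as code units instead of relying on the console. Below is a rough sketch; the class and method names are made up for illustration, and Locale.forLanguageTag("lt") stands in for the new Locale("lt") used in the question. It also shows that passing a locale-neutral Locale.ROOT keeps the length at 1.

```java
import java.util.Locale;

public class LithuanianLowerCaseDemo {
    // Formats every UTF-16 code unit of s as U+XXXX, separated by spaces.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format(Locale.ROOT, "U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Lower-cases with the Lithuanian rules (illustrative helper).
    static String lowerLithuanian(String s) {
        return s.toLowerCase(Locale.forLanguageTag("lt"));
    }

    // Lower-cases with the locale-neutral root locale (illustrative helper).
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String str = "\u00cc"; // Ì

        String lt = lowerLithuanian(str);
        System.out.println(codeUnits(lt) + " length " + lt.length());
        // prints: U+0069 U+0307 U+0300 length 3

        String root = lowerRoot(str);
        System.out.println(codeUnits(root) + " length " + root.length());
        // prints: U+00EC length 1
    }
}
```

If the Lithuanian behaviour is not what you want, passing an explicit locale to toLowerCase(), rather than changing the default with Locale.setDefault(), keeps the result independent of the machine's settings.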


Source: https://habr.com/ru/post/1385951/

