Java: string length when using a Unicode overline to display square roots?

In Java, I build a string that uses Unicode combining overlines because I'm trying to display square roots of numbers. I need to know the string's length for some formatting work. When combining characters are used, the usual string length methods do not seem to work, as the following example shows. Can someone help me find the length of the second string when arbitrary numbers are under the square root sign, or give tips on how to display the square root better?

    String s = "\u221A" + "12";
    String t = "\u221A" + "1" + "\u0305" + "2" + "\u0305";
    System.out.println(s);
    System.out.println(t);
    System.out.println(s.length());
    System.out.println(t.length());

Thanks for any help, I could not find anything using Google.

1 answer

"the usual string length methods do not seem to work"

They do not fail: they report the string length as the number of Unicode characters [*]. If you need different behavior, you have to define clearly what you mean by "string length".

If you are interested in string lengths for display purposes, then you usually want to count pixels (or some other logical or physical unit), and that is the responsibility of the display layer (for starters, different characters may have different widths if the font is not monospaced).

But if you're just interested in counting the number of graphemes ("the smallest distinctive unit of writing in the context of a particular writing system"), here is a good reference with code and examples. Copying, trimming, and pasting the relevant code from there, we get something like this:

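    // requires: import java.text.BreakIterator;
    public static int getGraphemeCount(String text) {
        int graphemeCount = 0;
        // Character-break iterator for the default locale
        BreakIterator graphemeCounter = BreakIterator.getCharacterInstance();
        graphemeCounter.setText(text);
        // Each successful next() steps over one grapheme boundary
        while (graphemeCounter.next() != BreakIterator.DONE)
            graphemeCount++;
        return graphemeCount;
    }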

Keep in mind that the example above uses the default locale. A more flexible and robust method would, for example, take an explicit locale as an argument and call BreakIterator.getCharacterInstance(locale) instead.
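A minimal sketch of that locale-aware variant, applied to the string from the question. The class name GraphemeDemo and the choice of Locale.ROOT are assumptions made for illustration, not part of the referenced code:

```java
import java.text.BreakIterator;
import java.util.Locale;

final class GraphemeDemo {

    // Counts graphemes by stepping over the character boundaries
    // that BreakIterator reports for the given locale.
    static int getGraphemeCount(String text, Locale locale) {
        BreakIterator graphemeCounter = BreakIterator.getCharacterInstance(locale);
        graphemeCounter.setText(text);
        int graphemeCount = 0;
        while (graphemeCounter.next() != BreakIterator.DONE) {
            graphemeCount++;
        }
        return graphemeCount;
    }

    public static void main(String[] args) {
        // Square root sign, then "1" and "2", each followed by a combining overline
        String t = "\u221A" + "1" + "\u0305" + "2" + "\u0305";
        System.out.println(t.length());                       // 5 UTF-16 code units
        System.out.println(getGraphemeCount(t, Locale.ROOT)); // 3 graphemes
    }
}
```

Each combining overline is grouped with the digit before it, so the overlined string should count the same number of graphemes as the plain "\u221A12".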

[*] To be precise, as pointed out in the comments, String.length() counts Java chars, which are actually UTF-16 code units. This is equivalent to counting Unicode characters only if we stay inside the BMP.
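To illustrate the footnote, here is a small sketch comparing String.length() with String.codePointCount(). The class name CodePointDemo and the sample code point U+1D465 are assumptions chosen for the demonstration:

```java
final class CodePointDemo {

    // Number of Unicode code points, as opposed to UTF-16 code units.
    static int codePointLength(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        // Every character here is inside the BMP, so both counts agree.
        String t = "\u221A" + "1" + "\u0305" + "2" + "\u0305";
        System.out.println(t.length());          // 5
        System.out.println(codePointLength(t));  // 5

        // U+1D465 lies outside the BMP and is stored as a surrogate pair.
        String outsideBmp = new String(Character.toChars(0x1D465));
        System.out.println(outsideBmp.length());         // 2
        System.out.println(codePointLength(outsideBmp)); // 1
    }
}
```

Note that counting code points still counts the combining overlines separately, so it does not replace the grapheme count for the string in the question.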


Source: https://habr.com/ru/post/1437930/

