Unicode strings in .NET with Hebrew letters and numbers

When you try to create a string containing a Hebrew letter followed by a number, strange behavior occurs: the number is always displayed to the left of the letter. For instance:

    string A = "\u05E9"; // A Hebrew letter
    string B = "23";
    string AB = A + B;
    textBlock1.Text = AB; // Output bug: B is displayed to the left of A.

This only happens when Hebrew is combined with digits. If either of the two parts is replaced, the problem disappears:

    string A = "\u20AA"; // Some random Unicode.
    string B = "23";
    string AB = A + B;
    textBlock1.Text = AB; // Output OK.

    string A = "\u05E9"; // A Hebrew letter.
    string B = "HELLO";
    string AB = A + B;
    textBlock1.Text = AB; // Output OK.

I tried playing with the FlowDirection property, but that didn't help.

A workaround that makes the first code sample display correctly would be welcome.

+6
4 answers
    string A = "\u05E9"; // A Hebrew letter
    string B = "23";
    string AB = B + A; // ! Note the reversed order.
    textBlock1.Text = AB;
    textBlock1.FlowDirection = FlowDirection.RightToLeft;
    // Output OK: A is displayed to the left of B, as intended.
0

The Unicode characters "right-to-left mark" (RLM, U+200F) and "left-to-right mark" (LRM, U+200E) were created for exactly this purpose.

In your example, just put the LTR mark after the Hebrew letter, and the digits will be displayed to the right of the Hebrew letter, as you want.

So your code would be adjusted as follows:

    string A = "\u05E9"; // A Hebrew letter
    string LTRMark = "\u200E";
    string B = "23";
    string AB = A + LTRMark + B;
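As a minimal sketch of the same idea, the marks can be kept as named constants. The names DirectionalMarks, WithLrm and StripMarks are made up for this illustration and are not part of .NET. One thing worth keeping in mind: the mark is invisible, but it is still a character in the string, so it changes Length and equality. The sketch therefore also shows one possible way to strip the marks again before the text is compared or parsed.

```csharp
public static class DirectionalMarks
{
    // U+200E LEFT-TO-RIGHT MARK and U+200F RIGHT-TO-LEFT MARK (both invisible).
    public const string Lrm = "\u200E";
    public const string Rlm = "\u200F";

    // Hypothetical helper: joins a right-to-left part and a following
    // left-to-right part (for example digits) with an LRM in between.
    public static string WithLrm(string rtlPart, string ltrPart)
    {
        return rtlPart + Lrm + ltrPart;
    }

    // Hypothetical helper: removes both marks, e.g. before comparing or
    // parsing text that was built only for display.
    public static string StripMarks(string text)
    {
        return text.Replace(Lrm, string.Empty).Replace(Rlm, string.Empty);
    }
}
```

With these helpers, textBlock1.Text = DirectionalMarks.WithLrm(A, B); produces the same string as A + LTRMark + B above, which is three visible characters but has a Length of 4.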
+12

This is caused by the Unicode bidirectional algorithm. If I understand it correctly, each Unicode character carries a directional property (an "identifier") that determines where it is placed relative to the text next to it.

In this case, \u05E9 is a right-to-left character, so the digits that follow it end up on its left. Even if you do this:

var ab = string.Format("{0}{1}", a, b);

the digits still end up on the left. However, if you use a character that is not right-to-left, for example \u20AA from the question, the digits stay on the right, because that character does not start a right-to-left run.

This is how the script is laid out: when the text is rendered, the layout engine orders it according to the rules of the script.
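To make the "directional property" idea concrete, here is a rough sketch that assigns a bidirectional class to the few characters that appear in this thread and then looks backward from a digit for the nearest strongly directional character, which is roughly what the bidirectional algorithm does when it resolves digits. The table is hand-written and deliberately tiny, and the names BidiClass, BidiSketch, Classify and PrecedingStrongClass are assumptions made for this illustration. It is not the full algorithm and not a .NET API.

```csharp
public enum BidiClass
{
    L,     // strong left-to-right (e.g. Latin letters, U+200E)
    R,     // strong right-to-left (e.g. Hebrew letters, U+200F)
    EN,    // European number (weak)
    ET,    // European number terminator (weak, e.g. currency signs)
    Other  // anything this sketch does not cover
}

public static class BidiSketch
{
    // Hand-written classification for the characters used in this thread only.
    public static BidiClass Classify(char c)
    {
        if (c >= '0' && c <= '9') return BidiClass.EN;
        if (c >= '\u05D0' && c <= '\u05EA') return BidiClass.R; // Hebrew letters alef..tav
        if (c == '\u200F') return BidiClass.R;                  // right-to-left mark
        if (c == '\u200E') return BidiClass.L;                  // left-to-right mark
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return BidiClass.L;
        if (c == '\u20AA') return BidiClass.ET;                 // new sheqel sign
        return BidiClass.Other;
    }

    // Walks backward from `index` to the nearest strong character (L or R).
    // Returns Other when the start of the text is reached first, in which
    // case the paragraph direction decides.
    public static BidiClass PrecedingStrongClass(string text, int index)
    {
        for (int i = index - 1; i >= 0; i--)
        {
            BidiClass cls = Classify(text[i]);
            if (cls == BidiClass.L || cls == BidiClass.R) return cls;
        }
        return BidiClass.Other;
    }
}
```

For the strings in the question, the digits in "\u05E923" are preceded by a strong R character, the digits in "\u20AA23" have no strong character before them, and in "\u05E9\u200E23" the nearest strong character before the digits is the L-class mark.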

+4

This strange behavior has an explanation. Digits that follow Hebrew characters are treated as part of the right-to-left text, and since Hebrew is read from right to left, this code

    string A = "\u05E9"; // A Hebrew letter
    string B = "23";
    string AB = A + B;

displays B and then A.

Second scenario:

    string A = "\u20AA"; // Some random Unicode.
    string B = "23";
    string AB = A + B;

Here A is a Unicode character that does not belong to a script read from right to left, so the output is A followed by B.

Now consider a scenario of my own:

    string A = "\u05E9";
    string B = "\u05EA";
    string AB = A + B;

Both A and B belong to a script read from right to left, so AB is displayed as B followed by A, not A followed by B.

Edit, in reply to a comment:

Given this scenario:

    string A = "\u05E9"; // A Hebrew letter
    string B = "23";
    string AB = A + B;

the only way I see to get the letter followed by the digits is to reverse the concatenation order: string AB = B + A;

That is not a solution that works in general, so I think you need to add some checks and build the string according to its contents.
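One possible shape for such a check is sketched below. It scans the text and inserts a left-to-right mark (U+200E) wherever an ASCII digit directly follows a Hebrew letter. The class and method names (HebrewDigitFixer, InsertLrmBeforeDigits) are hypothetical, and only the basic Hebrew letters U+05D0 to U+05EA and the digits 0 to 9 are considered. Whether forcing digits to the right of the Hebrew text is the wanted display depends on the application.

```csharp
using System.Text;

public static class HebrewDigitFixer
{
    private const char Lrm = '\u200E'; // LEFT-TO-RIGHT MARK

    private static bool IsHebrewLetter(char c)
    {
        return c >= '\u05D0' && c <= '\u05EA'; // alef..tav
    }

    private static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    // Returns a copy of `text` with an LRM inserted before every digit that
    // directly follows a Hebrew letter. Other text is copied unchanged.
    public static string InsertLrmBeforeDigits(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        var sb = new StringBuilder(text.Length + 2);
        for (int i = 0; i < text.Length; i++)
        {
            if (i > 0 && IsAsciiDigit(text[i]) && IsHebrewLetter(text[i - 1]))
            {
                sb.Append(Lrm);
            }
            sb.Append(text[i]);
        }
        return sb.ToString();
    }
}
```

In the scenario above this would be used as textBlock1.Text = HebrewDigitFixer.InsertLrmBeforeDigits(A + B); while strings such as "HELLO23" or two Hebrew letters in a row pass through unchanged.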

0

Source: https://habr.com/ru/post/892120/

