Why are some identical strings not interned in .NET?

    string s1 = "test";
    string s5 = s1.Substring(0, 3) + "t";
    string s6 = s1.Substring(0, 4) + "";
    Console.WriteLine("{0} ", object.ReferenceEquals(s1, s5)); // False
    Console.WriteLine("{0} ", object.ReferenceEquals(s1, s6)); // True

Both s5 and s6 have the same value as s1 ("test"). Based on the concept of string interning, both comparisons should evaluate to true. Can someone explain why s5 does not refer to the same object as s1?

+3
5 answers

You should get false for calling ReferenceEquals on string objects that are not string literals.

Essentially, the last line prints True by coincidence: when an empty string is passed to string concatenation, a library optimization recognizes this and returns the original string. This has nothing to do with interning, because the same thing happens with strings that you read from the console or construct in any other way:

    var s1 = Console.ReadLine();
    var s2 = s1 + "";
    var s3 = "" + s1;
    Console.WriteLine(
        "{0} {1} {2}",
        object.ReferenceEquals(s1, s2),
        object.ReferenceEquals(s1, s3),
        object.ReferenceEquals(s2, s3));

The code above prints:

 True True True 


+8

The CLR does not intern all strings. All string literals are interned by default. However, the following:

 Console.WriteLine("{0} ", object.ReferenceEquals(s1, s6)); //True 

returns True because this line:

 string s6 = s1.Substring(0,4)+""; 

is effectively optimized to return the same reference. The string happens (most likely) to be interned, but that is a coincidence. If you want to check whether a string is interned, use String.IsInterned().

If you want to intern strings at runtime, you can use String.Intern and store the returned reference, as described in the MSDN documentation: String.Intern Method (String). However, I strongly recommend not using this method unless you have a good reason: it has performance considerations and potentially unwanted side effects (for example, strings that have been interned cannot be garbage collected).
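As a minimal sketch of the two methods mentioned above: String.Intern returns the pooled reference for a value, and String.IsInterned returns null when the value is not in the pool. The class name InternDemo and the helper Build are hypothetical, introduced only for illustration.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: builds "test" at run time, so the result is a
    // fresh object rather than a compile-time literal.
    public static string Build()
    {
        return new string(new[] { 't', 'e', 's', 't' });
    }

    public static void Main()
    {
        string a = Build();
        string b = Build();

        // Equal values, but two distinct objects.
        Console.WriteLine(object.ReferenceEquals(a, b));                               // False

        // String.Intern returns the single pooled reference for a given value,
        // so both calls yield the same object.
        Console.WriteLine(object.ReferenceEquals(string.Intern(a), string.Intern(b))); // True

        // String.IsInterned returns null when the value is not in the pool.
        string unique = Guid.NewGuid().ToString();
        Console.WriteLine(string.IsInterned(unique) == null);                          // True
    }
}
```

Note that the variables a and b still refer to their original objects after the call; only the value returned by String.Intern is the pooled reference, which is why it has to be stored.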

+2

The Substring method is smart enough to return the original string when the requested substring is the entire original string. See the reference source linked in the comment by @DanielA.White. So s1.Substring(0,4) returns s1 when s1 has length 4. And, apparently, the + operator has a similar optimization, so that

 string s6 = s1.Substring(0,4)+""; 

is functionally equivalent to:

 string s6 = s1; 
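
The behaviour described above can be checked with a short sketch. It assumes the implementation shortcuts this answer describes, which are implementation details rather than documented guarantees, so the first two results are what the answer leads one to expect rather than a promise. The string is built at run time so that interning of literals plays no role; the class name SameReferenceDemo is hypothetical.

```csharp
using System;

public static class SameReferenceDemo
{
    public static void Main()
    {
        // Built at run time, so this is not an interned literal.
        string s = new string('x', 4);

        string whole = s.Substring(0, s.Length);   // the whole string is requested
        string padded = s + "";                    // concatenation with an empty string
        string rebuilt = s.Substring(0, 3) + "x";  // same value, assembled from parts

        // Expected True if Substring and concatenation take the shortcuts described above.
        Console.WriteLine(object.ReferenceEquals(s, whole));
        Console.WriteLine(object.ReferenceEquals(s, padded));

        // False: a new string object is allocated for the concatenation result.
        Console.WriteLine(object.ReferenceEquals(s, rebuilt));

        // True: value equality still holds.
        Console.WriteLine(s == rebuilt);
    }
}
```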
+1

Strings in .NET can be interned. Nowhere is it said that two identical strings must be the same string instance. As a rule, the compiler interns identical string literals, but this is not true for all strings and certainly does not apply to strings created dynamically at runtime.

+1

From the MSDN documentation for Object.ReferenceEquals:

When comparing strings. If objA and objB are strings, the ReferenceEquals method returns true if the string is interned. It does not check for equality of values. In the following example, s1 and s2 are equal since they are two instances of the same interned string. However, s3 and s4 are not equal, because although they have the same string values, this string is not interned.

    using System;

    public class Example
    {
        public static void Main()
        {
            String s1 = "String1";
            String s2 = "String1";
            Console.WriteLine("s1 = s2: {0}", Object.ReferenceEquals(s1, s2));
            Console.WriteLine("{0} interned: {1}", s1,
                              String.IsNullOrEmpty(String.IsInterned(s1)) ? "No" : "Yes");

            String suffix = "A";
            String s3 = "String" + suffix;
            String s4 = "String" + suffix;
            Console.WriteLine("s3 = s4: {0}", Object.ReferenceEquals(s3, s4));
            Console.WriteLine("{0} interned: {1}", s3,
                              String.IsNullOrEmpty(String.IsInterned(s3)) ? "No" : "Yes");
        }
    }
    // The example displays the following output:
    //       s1 = s2: True
    //       String1 interned: Yes
    //       s3 = s4: False
    //       StringA interned: No
+1

Source: https://habr.com/ru/post/1266287/

