Sharing a character buffer between C# string objects

Is it possible? Given that C# uses immutable strings, one would expect strings to have a method along these lines:

var expensive = ReadHugeStringFromAFile();
var cheap = expensive.SharedSubstring(1);

If there is no such function, why bother making strings immutable? Or, alternatively, if strings are already immutable for other reasons, why not provide this method?

The specific reason I'm asking is parsing files. Simple recursive descent parsers (such as those generated by TinyPG, or ones easily written by hand) use Substring all over the place. This means that if you give them a large file to parse, the memory churn is unbelievable. Of course there are workarounds: basically roll your own SubString class, and then, of course, forget about using String methods such as StartsWith or string libraries such as Regex, so you need to roll your own versions of those as well. I guess parser generators like ANTLR basically do this, but my format is simple enough not to justify such a monster. Even TinyPG is probably overkill.

Someone please tell me that I missed some obvious or not-so-obvious standard C# method call somewhere...

+3
6 answers

No, nothing like that.

.NET strings contain their text data directly, unlike Java strings, which (at least in older runtimes, before Java 7 update 6) hold a reference to a char array plus an offset and a length.

Both solutions have gains in some situations and losses in others.

If you are absolutely sure that this will be a killer for you, you can implement a Java-style string for use in your own internal APIs.
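A minimal sketch of what such a Java-style string might look like. The type name SharedString and its members are hypothetical, not part of .NET, and for simplicity the sketch keeps a reference to an immutable string rather than a char array:

```csharp
using System;

// Hypothetical type: a view (buffer + offset + length) onto a shared string.
public sealed class SharedString
{
    private readonly string _buffer; // shared between views, never copied
    private readonly int _offset;
    private readonly int _length;

    public SharedString(string buffer) : this(buffer, 0, buffer.Length) { }

    private SharedString(string buffer, int offset, int length)
    {
        _buffer = buffer;
        _offset = offset;
        _length = length;
    }

    public int Length => _length;

    public char this[int index]
    {
        get
        {
            if (index < 0 || index >= _length)
                throw new ArgumentOutOfRangeException(nameof(index));
            return _buffer[_offset + index];
        }
    }

    // O(1): only the offset and length change, the buffer itself is shared.
    public SharedString SharedSubstring(int start)
    {
        if (start < 0 || start > _length)
            throw new ArgumentOutOfRangeException(nameof(start));
        return new SharedString(_buffer, _offset + start, _length - start);
    }

    // Characters are copied only when a real string is requested.
    public override string ToString() => _buffer.Substring(_offset, _length);
}
```

The trade-off is the usual one for this design: a small view keeps the whole underlying buffer alive for as long as the view is reachable.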

+5

As far as I know, all serious parsers read their input from streams. Wouldn't that work for your situation?
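As a rough illustration of the stream-based approach, here is a sketch that pulls whitespace-separated words from a TextReader one character at a time, so the whole file never has to sit in memory as a single string. The StreamTokenizer class and the ReadWords method are made-up names for this example, and a real grammar would need more than whitespace splitting:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class StreamTokenizer
{
    // Lazily yields whitespace-separated words read from the given reader.
    public static IEnumerable<string> ReadWords(TextReader reader)
    {
        var word = new StringBuilder();
        int c;
        while ((c = reader.Read()) != -1) // Read() returns -1 at end of input
        {
            if (char.IsWhiteSpace((char)c))
            {
                if (word.Length > 0)
                {
                    yield return word.ToString();
                    word.Clear();
                }
            }
            else
            {
                word.Append((char)c);
            }
        }
        // Flush the last word if the input does not end with whitespace.
        if (word.Length > 0)
            yield return word.ToString();
    }
}
```

For a file, the reader could be a StreamReader opened with File.OpenText; a StringReader works the same way for in-memory text.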

+2

.NET Framework string interning [...] obvious [...] StringBuilder [...]

+1

[...] C# [...]

[...] O(1) concats [...] O(log n) [...] C# [...] Java [...]

[...] TinyPG [...] ANTLR [...]

0

[...] StringBuilder [...]

0

[...]

string text = myCheapObject;

and it will work without problems, as if it were a real string. Adding support for a few convenience methods, such as StartsWith, would be quick and easy (each would be a one-liner).
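One way the assignment above could be made to work is a user-defined implicit conversion to string. The sketch below assumes a hypothetical CheapString struct (the name and layout are illustrative only) and shows StartsWith as a one-liner built on string.CompareOrdinal, which compares ranges in place without copying:

```csharp
using System;

// Hypothetical type: a slice of an existing string that converts to string on demand.
public readonly struct CheapString
{
    private readonly string _source;
    private readonly int _offset;
    private readonly int _length;

    public CheapString(string source, int offset, int length)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (offset < 0 || length < 0 || offset + length > source.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));
        _source = source;
        _offset = offset;
        _length = length;
    }

    // Compares the prefix against the slice without allocating anything.
    public bool StartsWith(string prefix) =>
        prefix.Length <= _length &&
        string.CompareOrdinal(_source, _offset, prefix, 0, prefix.Length) == 0;

    // Lets a CheapString be assigned wherever a string is expected;
    // this is the only place where characters are copied.
    public static implicit operator string(CheapString s) =>
        s._source.Substring(s._offset, s._length);
}
```

With this in place, string text = new CheapString("hello world", 6, 5); compiles and yields "world". Note that the conversion allocates a new string each time, so it is best kept for the boundaries where a real string is unavoidable.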

Another option is to write a regular parser and store your tokens in a dictionary, handing out references to the stored tokens instead of keeping multiple copies.
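The dictionary idea might look roughly like this. TokenPool and Share are hypothetical names; the sketch simply maps each distinct token text to the first string instance seen with that text:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: keeps one string instance per distinct token text.
public sealed class TokenPool
{
    private readonly Dictionary<string, string> _tokens =
        new Dictionary<string, string>(StringComparer.Ordinal);

    // Returns the pooled instance equal to text, adding it on first sight.
    public string Share(string text)
    {
        if (_tokens.TryGetValue(text, out var pooled))
            return pooled;
        _tokens.Add(text, text);
        return text;
    }
}
```

The temporary string passed to Share is still allocated, but duplicates become garbage immediately instead of staying referenced from the parse tree, which helps most when the input repeats a small vocabulary of tokens.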

0

Source: https://habr.com/ru/post/1710576/

