Difference between writing a string and a char array with System.IO.BinaryWriter

I am writing text to a binary file in C# and see a difference in the number of bytes written when I write a string versus a character array. I am using System.IO.BinaryWriter and checking BinaryWriter.BaseStream.Length after each write. These are my results:

    using (BinaryWriter bw = new BinaryWriter(File.Open("data.dat", FileMode.Create), Encoding.ASCII))
    {
        string value = "Foo";

        // Writes 4 bytes
        bw.Write(value);

        // Writes 3 bytes
        bw.Write(value.ToCharArray());
    }

I do not understand why the string overload writes 4 bytes when I'm writing only 3 ASCII characters. Can anyone explain this?

4 answers

The documentation for BinaryWriter.Write(string) states that it writes a length-prefixed string to the stream. The Write(char[]) overload writes no such prefix.

It looks like the extra byte is the length.
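To check this without touching the disk, here is a small sketch that writes into a MemoryStream and dumps the bytes. The class and method names (PrefixDemo, WriteAsString, WriteAsChars) are made up for illustration; only the BinaryWriter calls come from the question.

```csharp
using System;
using System.IO;
using System.Text;

static class PrefixDemo
{
    // Bytes produced by the string overload (length prefix + characters).
    public static byte[] WriteAsString(string value)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            bw.Write(value);
            bw.Flush();
            return ms.ToArray();
        }
    }

    // Bytes produced by the char[] overload (characters only).
    public static byte[] WriteAsChars(string value)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            bw.Write(value.ToCharArray());
            bw.Flush();
            return ms.ToArray();
        }
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(WriteAsString("Foo"))); // 03-46-6F-6F
        Console.WriteLine(BitConverter.ToString(WriteAsChars("Foo")));  // 46-6F-6F
    }
}
```

The leading 03 in the first line is the length prefix; 46 6F 6F are the ASCII codes for "Foo".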

EDIT:

To be more explicit, open the class in Reflector. You will see this call as part of the Write(string) method:

    this.Write7BitEncodedInt(byteCount);

This method encodes an integer in as few bytes as possible: each byte carries 7 bits of the value, and the high bit signals that another byte follows. For short strings (the everyday case, with an encoded length under 128 bytes), the length fits in a single byte. Longer strings need more bytes for the prefix.

Here is the code for this function just in case you are interested:

    protected void Write7BitEncodedInt(int value)
    {
        uint num = (uint) value;
        while (num >= 0x80)
        {
            this.Write((byte) (num | 0x80));
            num = num >> 7;
        }
        this.Write((byte) num);
    }

After writing the length prefix with this encoding, it writes the bytes of the characters in the writer's encoding.
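To see where the prefix grows from one byte to two, here is a rough standalone sketch of the same 7-bit scheme. SevenBitDemo, Encode and PrefixedLength are hypothetical names of mine; Encode mirrors the decompiled loop but collects the bytes in a list instead of writing to a stream, and masks with 0x7F so the cast is safe even in a checked build.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class SevenBitDemo
{
    // Same scheme as Write7BitEncodedInt: 7 value bits per byte,
    // high bit set on every byte except the last.
    public static byte[] Encode(int value)
    {
        var bytes = new List<byte>();
        uint num = (uint)value;
        while (num >= 0x80)
        {
            bytes.Add((byte)((num & 0x7F) | 0x80));
            num >>= 7;
        }
        bytes.Add((byte)num);
        return bytes.ToArray();
    }

    // Total bytes BinaryWriter emits for an ASCII string of the given length.
    public static long PrefixedLength(int charCount)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            bw.Write(new string('a', charCount));
            bw.Flush();
            return ms.Length;
        }
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(Encode(3)));   // 03
        Console.WriteLine(BitConverter.ToString(Encode(127))); // 7F
        Console.WriteLine(BitConverter.ToString(Encode(128))); // 80-01
        Console.WriteLine(BitConverter.ToString(Encode(300))); // AC-02
        Console.WriteLine(PrefixedLength(127)); // 128 (1-byte prefix)
        Console.WriteLine(PrefixedLength(128)); // 130 (2-byte prefix)
    }
}
```

So a 127-character ASCII string costs one extra byte, while a 128-character one costs two.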


From the BinaryWriter.Write(string) docs:

Writes a length-prefixed string to this stream in the current encoding of the BinaryWriter, and advances the current position of the stream in accordance with the encoding used and the specific characters being written to the stream.

This is likely so that the individual strings can be identified when the file is read back with BinaryReader. (For example, 3Foo3Bar6Foobar can be parsed into the strings "Foo", "Bar" and "Foobar", but FooBarFoobar cannot.) In fact, BinaryReader.ReadString uses exactly this information to read a string from a binary file.
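The 3Foo3Bar6Foobar idea can be sketched as a round trip through a MemoryStream. RoundTripDemo and RoundTrip are illustrative names, and the leaveOpen constructor overloads used here assume .NET Framework 4.5 or later.

```csharp
using System;
using System.IO;
using System.Text;

static class RoundTripDemo
{
    // Writes each value with its length prefix, then reads them back in order.
    public static string[] RoundTrip(params string[] values)
    {
        using (var ms = new MemoryStream())
        {
            using (var bw = new BinaryWriter(ms, Encoding.ASCII, leaveOpen: true))
            {
                foreach (string v in values)
                {
                    bw.Write(v);
                }
            }

            ms.Position = 0; // rewind before reading

            var result = new string[values.Length];
            using (var br = new BinaryReader(ms, Encoding.ASCII, leaveOpen: true))
            {
                for (int i = 0; i < result.Length; i++)
                {
                    // ReadString consumes the prefix, then exactly that many bytes.
                    result[i] = br.ReadString();
                }
            }
            return result;
        }
    }

    static void Main()
    {
        string[] back = RoundTrip("Foo", "Bar", "Foobar");
        Console.WriteLine(string.Join(", ", back)); // Foo, Bar, Foobar
    }
}
```

Without the prefixes the reader would only see FooBarFoobar and would have no way to tell where one string ends and the next begins.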

From the BinaryWriter.Write(char[]) docs:

Writes a character array to the current stream and advances the current position of the stream in accordance with the encoding used and the specific characters being written to the stream.

It is hard to overstate how comprehensive and useful the MSDN documentation is. Always check it first.


As already mentioned, BinaryWriter.Write(string) writes the length of the string to the stream before writing the string itself.

This lets BinaryReader.ReadString() know how long the string is.

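    using (BinaryReader br = new BinaryReader(File.OpenRead("data.dat")))
    {
        // Reads the length prefix, then that many bytes
        string foo1 = br.ReadString();

        // No prefix here, so the caller has to supply the count
        char[] foo2 = br.ReadChars(3);
    }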

Have you looked at what is actually written? I would guess a null terminator.


Source: https://habr.com/ru/post/1286433/

