Fixed-Width Strings and Null-Terminated Strings

gcc 4.4.4 c89

Recently I was in a discussion about "fixed-width strings" and "null-terminated strings."

When I think about it, they seem to be the same thing: a string with a terminating null character.

For example:

char *name = "Joe bloggs"; 

It is a fixed-width string that cannot be changed. It also has a trailing zero.

Also in the discussion I was told that strncpy should never be used for null-terminated strings.

Many thanks for any suggestions,

+3
3 answers

The term "fixed-width string" usually refers to something completely different.

A fixed-width string of width N is a string that occupies exactly N characters, all N of which are guaranteed to be initialized. To represent a shorter string, you must pad it at the end with as many null characters as it takes to fill all N positions. Note that a string that is exactly N characters long leaves no room for a null character at the end. That is, in the general case, fixed-width strings are not null-terminated!

What is the purpose of this? It saves 1 character while allowing the string to be as long as possible. With fixed-width strings of width N, you need exactly N characters of storage to represent a string of length N. Compare this to regular null-terminated strings, which require N + 1 characters (an extra character for the null terminator).
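To make the N versus N + 1 arithmetic concrete, here is a minimal sketch. The names FW_WIDTH, fw_field, nt_field and fw_bytes_saved are hypothetical, picked only for illustration, and the width 14 is an arbitrary choice.

```c
#include <stddef.h>

#define FW_WIDTH 14  /* hypothetical field width, for illustration only */

/* Fixed-width storage: every one of the FW_WIDTH bytes may hold text. */
typedef char fw_field[FW_WIDTH];

/* Null-terminated storage for the same text needs one extra byte. */
typedef char nt_field[FW_WIDTH + 1];

/* Number of bytes saved per field by dropping the terminator. */
size_t fw_bytes_saved(void)
{
    return sizeof(nt_field) - sizeof(fw_field);
}
```

The saving is always a single byte per field, which is why it only mattered where fields were tiny and numerous.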

Why is it padded with null characters at the end? The padding simplifies lexicographic comparison of fixed-width strings: you simply compare all N characters until you reach a difference. Note that you could pad a fixed-width string to full length with any character, as long as the resulting lexicographic order is right. However, the null character is a good choice of padding.
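A rough sketch of such a comparison, assuming a width of 7 and null-character padding. FW_WIDTH and fw_compare are hypothetical names, not part of any library.

```c
#include <string.h>

#define FW_WIDTH 7  /* hypothetical width, for illustration only */

/* Compares two fixed-width strings over all FW_WIDTH bytes.
 * Returns a negative value, zero, or a positive value, like memcmp.
 * Because the padding byte is 0, a shorter string sorts before any
 * longer string that starts with it. */
int fw_compare(const char a[FW_WIDTH], const char b[FW_WIDTH])
{
    return memcmp(a, b, FW_WIDTH);
}
```

No terminator search is needed: the comparison always covers the full width, and the padding takes care of strings of different lengths.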

When is this useful? Rarely. The savings provided by fixed-width strings seldom matter in general string processing: they are too small, and they occur only when the string uses the full width. But they can be useful in some specific cases.

Where does all this come from? A classic example of a “fixed-width string” is the 14-character file name field in some old versions of the Unix file system. It was represented by an array of 14 characters and used the fixed-width representation. At the time, it was important to save that 1 character for a full-length (all 14 characters) file name.

Now on to strncpy. The strncpy function was introduced specifically to initialize these 14-character file name fields in that file system. It was designed to produce a valid fixed-width string: it converts a null-terminated string to a fixed-width string. Unfortunately, it got a misleading name, so many people today mistake it for a "safe" copy function for null-terminated strings. The latter is a complete misunderstanding of the purpose and functionality of strncpy.

Using string literals to represent fixed-width strings (as in your example) is not a good idea, since string literals always add a null character at the end, and fixed-width strings do not necessarily have one. This is how a few fixed-width strings can be initialized in C:

 char fw_string1[7] = { 'T', 'h', 'i', 's', ' ', 'i', 's' };
 char fw_string2[7] = { 's', 't', 'r', 'i', 'n', 'g' };
 char fw_string3[7] = { 'H', 'e', 'l', 'l', 'o' };

All three arrays have the same number of elements: 7. Note that the first string is not null-terminated, while the others are padded with null characters (elements without an explicit initializer are set to zero). Converting a “regular” string to a fixed-width one looks like this:

 char fw_string4[7];
 strncpy(fw_string4, "Hi!", 7);

In this case, strncpy is used for exactly what it was intended for.
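The conversion can be wrapped in a small helper to make both behaviours of strncpy visible: padding when the source is short, and no terminator when the source fills the field. This is only a sketch; FW_WIDTH and fw_from_cstring are hypothetical names.

```c
#include <string.h>

#define FW_WIDTH 7  /* hypothetical width, for illustration only */

/* Converts the null-terminated string src into a fixed-width field.
 * - If src is shorter than FW_WIDTH, strncpy pads the rest with '\0'.
 * - If src has FW_WIDTH or more characters, only the first FW_WIDTH
 *   are copied and NO terminator is written. */
void fw_from_cstring(char field[FW_WIDTH], const char *src)
{
    strncpy(field, src, FW_WIDTH);
}
```

After fw_from_cstring(field, "Hi!") the field holds 'H', 'i', '!' followed by four null characters; after fw_from_cstring(field, "This is") all seven bytes hold text and none of them is a null character.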

Remember also that, apart from the strncpy conversion function, the standard library has practically no means for working with fixed-width strings. You basically have to treat them as raw arrays of characters and perform any higher-level operations manually. Most of the basic operations can, of course, be implemented with functions from the mem... group; memcmp, for example, implements comparison.
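As an illustration of doing such operations by hand, here is one possible way to get the length of a fixed-width string and to convert it back to a null-terminated one using only mem... functions. fw_length and fw_to_cstring are hypothetical helpers, not standard functions.

```c
#include <string.h>

/* Length of the text stored in a fixed-width field of the given width:
 * the index of the first padding '\0', or width if the field is full. */
size_t fw_length(const char *field, size_t width)
{
    const char *nul = (const char *)memchr(field, '\0', width);
    return nul ? (size_t)(nul - field) : width;
}

/* Copies the field into dst as a null-terminated string.
 * dst must have room for at least width + 1 characters. */
void fw_to_cstring(char *dst, const char *field, size_t width)
{
    size_t len = fw_length(field, width);
    memcpy(dst, field, len);
    dst[len] = '\0';
}
```

Note that strlen cannot be used here, because a full field has no terminator for it to find.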

P.S. In fact, taking caf's comment into account, in C you can use string literals to initialize fixed-width strings, since C allows the literal initializer to be one character longer than the array (i.e., in C it is fine if the terminating null character does not fit into the array). Thus, the above can be equivalently rewritten as

 char fw_string1[7] = "This is";
 char fw_string2[7] = "string";
 char fw_string3[7] = "Hello";

Note that fw_string1 in this case is still not null-terminated.
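This can be checked with a small sketch. It is valid only when compiled as C (C++ rejects the first initializer), and some compilers may warn about the dropped terminator. The names fw_full, fw_padded and fw_has_padding are hypothetical.

```c
#include <string.h>

/* "This is" has exactly 7 characters, so its terminator is dropped. */
const char fw_full[7] = "This is";

/* "string" has 6 characters, so the 7th byte is the null character. */
const char fw_padded[7] = "string";

/* Returns 1 if the field contains at least one '\0', 0 otherwise. */
int fw_has_padding(const char *field, size_t width)
{
    return memchr(field, '\0', width) != NULL;
}
```

Here fw_has_padding(fw_full, 7) yields 0 and fw_has_padding(fw_padded, 7) yields 1.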

+7

First of all, I think you mean a fixed-length string, not a fixed-width string.

Secondly, the above is a null-terminated string. It cannot be changed because it is defined as a string literal, which must not be modified.

AFAIK, C does not have real "fixed-length strings." At best, you can define a buffer of size N and put no more than N-1 characters in it; placing more is an error, and forgetting the null terminator can also be an error.

As for strncpy, it copies at most the specified number of characters and, if the source is shorter, pads the rest of the destination with null characters. This means that if the destination is not long enough, you will either write past the available space (if the count is larger than the buffer) or end up without a null terminator for your string, which will lead to errors when you try to use it as a string.
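If strncpy is used on null-terminated strings anyway, a common workaround is to reserve the last byte and write the terminator by hand. A minimal sketch, with the hypothetical helper name copy_with_strncpy:

```c
#include <string.h>

/* Copies src into dst, which has room for dst_size characters.
 * The copy is truncated if necessary, and dst is always
 * null-terminated as long as dst_size is at least 1. */
void copy_with_strncpy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return;  /* no room even for the terminator */
    }
    strncpy(dst, src, dst_size - 1);  /* never touches the last byte */
    dst[dst_size - 1] = '\0';         /* terminate unconditionally */
}
```

With a 4-byte buffer, copying "Joe bloggs" leaves "Joe" in the buffer. Silent truncation may or may not be acceptable, so the caller still has to decide what to do with overlong input.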

+1

I'm not quite sure about the term "fixed-width string". Depending on the C string function, the terminating \0 is either required or not. Functions like strlen and strcpy must work with \0-terminated strings to know when to stop. Functions such as strncpy do not need the source string to be \0-terminated, since a separate argument indicates how many characters to copy.

When you declare name as you do, the contents that name points to are stored in read-only memory and cannot be changed. However, you can use name in C functions that do not change the contents, e.g. strlen(name), or as the source of a copy:

 char mycopy[32];
 strcpy(mycopy, name);
+1

Source: https://habr.com/ru/post/1341372/

