C Strings: simple question

I have three variables initialized below:

 char c1[] = "Hello";
 char c2[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
 char* c3 = "Hello";

I know that c1 and c2 are the same and that they are both strings because they end with \0. However, c3 is different from c1 and c2. Is this because c3 does not end with \0? Does this mean that c3 is not a string? If c3 is not a string, then why does printf("%s", c3); not give an error? Thanks!

EDIT:

Is there a reason c1 and c2 can be changed, but c3 cannot?

+6
9 answers

As far as C is concerned, the most significant difference between c3 and the others is that you are not allowed to modify the underlying characters through c3 . I often find it useful to think of it this way:

 char *xyz = "xyz"; 

creates a modifiable pointer on the stack and makes it point to an unmodifiable sequence of characters {'x','y','z','\0'} . On the other hand,

 char xyz[] = "xyz"; 

creates a modifiable array on the stack that is large enough to hold the sequence of characters {'x','y','z','\0'} , and then copies that sequence into it. The contents of the array can then be modified. Keep in mind that the standard says nothing about stacks, but this is how it is usually implemented. In the end, it's all just memory.
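A minimal sketch of that distinction, assuming a hosted C99 compiler. The helper name modify_copy and the buffer parameter are hypothetical and exist only for illustration. The pointer is declared const here so that the compiler rejects accidental writes through it:

```c
#include <string.h>

/* Hypothetical helper: changes the array copy, hands it back through out
   (which must hold at least 4 bytes), and returns the first character
   seen through the pointer to the literal. */
char modify_copy(char out[4])
{
    const char *p = "xyz";  /* pointer to the literal's own storage */
    char a[] = "xyz";       /* separate 4-byte array: 'x', 'y', 'z', '\0' */

    a[0] = 'X';             /* allowed: the array is a private, writable copy */
    memcpy(out, a, sizeof a);

    return p[0];            /* the literal itself was never touched */
}
```

Calling modify_copy with a 4-byte buffer leaves "Xyz" in the buffer, while the returned character is still 'x'.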

Formally, c3 is a pointer to a string literal, while c1 and c2 are both arrays of characters that happen to end with a null character. When they are passed to functions such as printf , they decay to a pointer to the first element of the array, which means they are treated inside the function exactly like c3 (in fact, they decay in quite a lot of circumstances; see the third quote from C99 below for the exceptions).

The relevant sections of C99 are 6.4.5 String literals , which explains why you are not allowed to modify what c3 points to:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

and why it has a null terminator:

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals.

And 6.3.2.1 Lvalues, arrays, and function designators under 6.3 Conversions says:

Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
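The two operator exceptions in that last quote can be observed directly. The following is only a sketch; the function names are hypothetical and not part of any library:

```c
#include <stddef.h>

/* sizeof applied to the array itself: no decay, so this is the array size. */
size_t size_of_array(void)
{
    char c1[] = "Hello";
    return sizeof c1;            /* 6: five letters plus the terminator */
}

/* In c1 + 0 the array decays first, so sizeof sees a char * instead. */
size_t size_after_decay(void)
{
    char c1[] = "Hello";
    return sizeof (c1 + 0);
}

/* &c1 does not decay either: it has type char (*)[6], not char *. */
int address_forms_agree(void)
{
    char c1[] = "Hello";
    char (*whole)[6] = &c1;      /* pointer to the whole array */
    char *first = c1;            /* decayed pointer to the first element */
    return (void *)whole == (void *)first;  /* same address, different types */
}
```

size_of_array() yields 6, size_after_decay() yields sizeof(char *) on the platform in use, and address_forms_agree() yields 1 because both pointers refer to the same location.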

+10

First of all,

 char* c3 = "Hello"; // may be valid C, but bad C++! 

is an error-prone style, so do not use it. Use this instead:

 const char* c3 = "Hello"; 

This is valid code. The c3 pointer holds the address of the place where the string "Hello" is stored. But you cannot modify *c3 (i.e. what c3 points to) the way you can in the earlier cases (if you do, the behavior is undefined).

+5

c3 is a pointer to a string, which is exactly what printf("%s", ...) expects as an argument.

The reason printf("%s", c1) or printf("%s", c2) also works is that arrays in C decay into pointers very readily in expressions. In fact, about the only times an array name does not decay into a pointer in an expression are when it is the operand of the sizeof operator or the operand of the & (address-of) operator.

This leads to the common confusion that pointers and arrays are equivalent in C, which is not true. It's just that arrays can be used almost everywhere in C as if they were pointers. The one exception is that they cannot be assigned to unless they are indexed (which turns out to be an expression that treats them as a pointer).

Note that there is another difference with the last line: since c3 points to a string literal, the string cannot be modified (what happens if you try is undefined).
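The assignment difference mentioned above can be sketched as follows. The function name reassign_demo and its out parameter are hypothetical; the rejected assignment is left in a comment because it would not compile:

```c
#include <string.h>

/* Hypothetical helper: copies the edited array into out (at least 6 bytes)
   and returns the first character the pointer refers to afterwards. */
char reassign_demo(char out[6])
{
    char c1[] = "Hello";
    const char *c3 = "Hello";

    /* c1 = "World";   rejected by the compiler: an array cannot be assigned to */
    c1[0] = 'J';       /* fine: indexing yields a single char element */
    c3 = "World";      /* fine: only the pointer changes, not any string */

    memcpy(out, c1, sizeof c1);
    return c3[0];
}
```

After the call the buffer holds "Jello" and the returned character is 'W'.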

+3

c1 and c2 each allocate 6 bytes of memory and store a zero-terminated string in them.

c3 , however, places an (also zero-terminated) string in program memory and creates a pointer to it, i.e. the string is stored alongside the program's instructions and not on the stack (or the heap? correct me), so editing it would be unsafe.

+1

In C, a "string" constant can mean two things, depending on the context in which it is used. It can denote a string in a read-only section of the executable (although I don't think the standard spells this out), so that the initialization const char *foo = "bar" makes foo point to wherever that part of the executable is loaded in memory. If the binary blob ( "bar" ) really is in a read-only section and you do something like foo[0] = 'x' (after casting away the const), you will typically get SIGSEGV .

However, when you write char x[] = "Hello" (or char x[6] = "Hello" ), you are using "Hello" as an initializer for an array (just like int x[2] = { 1, 2 } ), and x is just an ordinary (writable) array allocated on the stack. In this case, "Hello" is simply shorthand for {'H', 'e', 'l', 'l', 'o', '\0' } .

Both "bar" and "Hello" have zero termination.
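Both claims are easy to check. This is a small sketch with hypothetical function names, comparing the two initializer forms byte by byte and looking at the terminators:

```c
#include <string.h>

/* Returns 1 if the string-literal initializer and the brace initializer
   produce arrays of the same size and the same contents. */
int same_bytes(void)
{
    char a[] = "Hello";
    char b[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
    return sizeof a == sizeof b && memcmp(a, b, sizeof a) == 0;
}

/* Returns 1 if both the pointed-to literal and the array end in '\0'. */
int both_terminated(void)
{
    const char *foo = "bar";
    char x[] = "Hello";
    return foo[3] == '\0' && x[5] == '\0';
}
```

Both functions return 1.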

+1

c3 itself is not terminated by NUL or NULL. It is a pointer to a string that is terminated by NUL.

0

It is a string. c3 points to a string, but using it this way is risky.

0

This is a pointer to a string with different endings.

0

c3 is a pointer to the first element of the string. c1 and c2 are just regular arrays that nothing points to.

0

Source: https://habr.com/ru/post/891417/

