What is the correct way to handle char * strings?

I have a third-party library that uses char* (not const) as a placeholder for string values. What is the correct and safe way to assign values to these data types? I have the following test, which uses my own timer class to measure the run time:

    #include "string.h"
    #include <iostream>
    #include <sj/timer_chrono.hpp>

    using namespace std;

    int main()
    {
        sj::timer_chrono sw;
        int iterations = 1e7;

        // first method gives compiler warning:
        // conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
        cout << "creating c-strings unsafe(?) way..." << endl;
        sw.start();
        for (int i = 0; i < iterations; ++i)
        {
            char* str = "teststring";
        }
        sw.stop();
        cout << sw.elapsed_ns() / (double)iterations << " ns" << endl;

        cout << "creating c-strings safe(?) way..." << endl;
        sw.start();
        for (int i = 0; i < iterations; ++i)
        {
            char* str = new char[strlen("teststr")];
            strcpy(str, "teststring");
        }
        sw.stop();
        cout << sw.elapsed_ns() / (double)iterations << " ns" << endl;

        return 0;
    }

Output:

    creating c-strings unsafe(?) way...
    1.9164 ns
    creating c-strings safe(?) way...
    31.7406 ns

While the "safe" way gets rid of the compiler warning, it makes the code about 15-20 times slower according to this benchmark (1.9 nanoseconds per iteration versus 31.7 nanoseconds per iteration). What is the correct method, and what is dangerous about the "deprecated" method?

+6
3 answers

The C++ standard is clear:

An ordinary string literal has type "array of n const char" (section 2.14.5/8 in C++11).

and

The effect of attempting to modify a string literal is undefined (section 2.14.5/12 in C++11).

For a string known at compile time, a safe way to get a non-const char* is to copy the literal into an array:

 char literal[] = "teststring"; 

You can then safely take a pointer to that array:

 char* ptr = literal; 

If you do not know the string at compile time but you do know its maximum length, you can use an array:

 char str[STR_LENGTH + 1]; 

If you do not know the length, you will need to use dynamic allocation. Make sure you free the memory when the strings are no longer needed.

This will only work if the API does not take ownership of the char* you pass to it.

If the API tries to free the strings internally, its documentation should say so and tell you the correct way to allocate them. You will need to match your allocation method to the one used inside the API.

 char literal[] = "test"; 

will create a local array of 5 characters with automatic storage (meaning the variable is destroyed when execution leaves the scope in which it is declared) and initialize the array elements with the characters 't', 'e', 's', 't' and '\0'.

You can subsequently edit these characters: literal[2] = 'x';

If you write this:

 char* str1 = "test"; char* str2 = "test"; 

then, depending on the compiler, str1 and str2 may have the same value (i.e. point to the same string).

("Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined." Section 2.14.5/12 of the C++ standard.)

It may also be the case that they are stored in a read-only section of memory, so that any attempt to modify the string results in an exception or a crash.

They are also, in fact, const (the literal is an array of const char that decays to const char*), so this line:

char * str = "test";

actually casts away the constness of the string, which is why the compiler gives a warning.

+10

The "unsafe" way is the way to go for all strings that are known at compile time.

Your "safe" way leaks memory and is rather horrific.

Normally you would have a sane C API that accepts const char*, so you could use the proper safe way in C++, i.e. std::string and its c_str() method.

If your C API takes ownership of the string, your "safe" way has another flaw: you cannot mix new[] and free(). Passing memory allocated with the C++ new[] operator to a C API that expects to call free() on it is not allowed. If the C API does not call free() on the string later, it should be fine to use new[] on the C++ side.

It's also a weird mix of C++ and C.

+5

You seem to have a fundamental misunderstanding about C strings here.

    cout << "creating c-strings unsafe(?) way..." << endl;
    sw.start();
    for (int i = 0; i < iterations; ++i)
    {
        char* str = "teststring";
    }

Here you simply assign a pointer to a string literal constant. In C, string literals are of type char[N] (const char[N] in C++), and you can assign one to a pointer because of array-to-pointer "decay". (However, assigning a string literal to a non-const pointer is deprecated.)

But assigning a pointer to a string literal may not be what you want to do. Your API expects a non-const string. String literals are const.

"What is the correct and safe way to assign values to these [char * strings]?"

There is no general answer to this question. Whenever you work with C strings (or pointers in general), you have to deal with the concept of ownership. C++ takes care of this for you automatically if you use std::string. Internally, std::string holds a pointer to an array of char, but it manages the memory for you, so you do not need to worry about it. When you use raw C strings, however, you need to think about memory management yourself.

How you manage the memory depends on what your program does. If you allocate a C string with new[], you must free it with delete[]. If you allocate it with malloc, you must free it with free(). A good solution for working with C strings in C++ is to use a smart pointer that takes ownership of the allocated C string (but you need to use a deleter that frees the memory with delete[]). Or you can just use std::vector<char>. As always, be sure to allocate space for the terminating null character.

Also, the reason your second loop is much slower is because it allocates memory at each iteration, while the first loop just assigns a pointer to a statically allocated string literal.

+4

Source: https://habr.com/ru/post/944118/

