Avoiding temporaries when looking up a std::map / std::unordered_map with std::string keys

Consider the following code:

    std::map<std::string, int> m1;
    auto i = m1.find("foo");

    const char* key = ...
    auto j = m1.find(key);

This creates a temporary std::string object for every map lookup. What are the canonical ways to avoid this?

+6
5 answers

Do not use pointers; pass strings directly instead. Then you can use references:

    void do_something(std::string const & key)
    {
        auto it = m.find(key);
        // ....
    }

C++ code usually becomes "more correct" the more you use its idioms instead of trying to write C in it.

+3

With the default comparator, there is no way to avoid a temporary std::string instance that copies the character data. Note that this cost is very low and involves no dynamic memory allocation if the string is short and your standard library implements the small string optimization.

However, if you often need to proxy C-style strings, you can come up with a custom solution that avoids this allocation. It can pay off if you do this very often and your strings are too long to benefit from the small string optimization.

If you need only a very small subset of string functions (for example, only assignment and copies), then you can write a small special string class that stores the const char * pointer and a function to free memory.

    #include <cstddef>
    #include <cstring>

    class cheap_string
    {
    public:
        typedef void (*Free)(const char*);

    private:
        const char* myData;
        std::size_t mySize;
        Free myFree;

    public:
        // Direct member assignment; use with care.
        cheap_string(const char* data, std::size_t size, Free free);

        // Releases the buffer using the custom deleter (a no-op for proxies).
        ~cheap_string();

        // Create real copies (safety first).
        cheap_string(const cheap_string&);
        cheap_string& operator=(const cheap_string&);
        cheap_string(const char* data);
        cheap_string(const char* data, std::size_t size)
            : myData(0), mySize(size), myFree(&destroy)
        {
            // Fill a writable buffer first: myData points to const char.
            char* buffer = new char[size + 1];
            std::memcpy(buffer, data, size);
            buffer[size] = '\0';
            myData = buffer;
        }

        const char* data() const;
        std::size_t size() const;

        // Whatever string functionality you need.
        bool operator<(const cheap_string&) const;
        bool operator==(const cheap_string&) const;

        // Create proxies for existing character buffers.
        static cheap_string proxy(const char* data)
        {
            return cheap_string(data, std::strlen(data), &abandon);
        }
        static cheap_string proxy(const char* data, std::size_t size)
        {
            return cheap_string(data, size, &abandon);
        }

    private:
        // Deleter for proxies: a no-op, because proxies do not own the data.
        static void abandon(const char*) {}

        // Deleter for copies (delete[]).
        static void destroy(const char* data) { delete[] data; }
    };

Then you can use this class as:

    std::map<cheap_string, int> m1;
    auto i = m1.find(cheap_string::proxy("foo"));

Unlike a temporary std::string, a temporary cheap_string proxy does not copy the character buffer, yet the class keeps safe copy semantics for cheap_string instances stored in standard containers.

Note: if your implementation does not perform return value optimization, you need an alternative syntax for the proxy method, for example a constructor overload that takes a special tag type (a proxy_t type, à la std::nothrow for placement new).
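The tag-constructor alternative mentioned in the note could look roughly like the sketch below. It is a simplified stand-in for the class above, not the answer's actual code: the names `proxy_t`, `proxy` and `myOwns` are hypothetical, a `bool` flag replaces the deleter function pointer, and assignment is left out to keep the example short.

```cpp
#include <cstddef>
#include <cstring>
#include <map>

// Hypothetical tag type, used the way std::nothrow_t is used for operator new.
struct proxy_t {};
constexpr proxy_t proxy{};

class cheap_string {
    const char* myData;
    std::size_t mySize;
    bool myOwns;  // true: buffer allocated by this object; false: borrowed

public:
    // Owning constructor: copies the characters, including the terminator.
    explicit cheap_string(const char* data)
        : mySize(std::strlen(data)), myOwns(true) {
        char* buffer = new char[mySize + 1];
        std::memcpy(buffer, data, mySize + 1);
        myData = buffer;
    }

    // Proxy constructor: borrows the caller's buffer, no allocation, no copy.
    cheap_string(proxy_t, const char* data)
        : myData(data), mySize(std::strlen(data)), myOwns(false) {}

    // Copies always own their buffer, so stored keys never dangle.
    cheap_string(const cheap_string& other)
        : mySize(other.mySize), myOwns(true) {
        char* buffer = new char[mySize + 1];
        std::memcpy(buffer, other.myData, mySize);
        buffer[mySize] = '\0';
        myData = buffer;
    }

    cheap_string& operator=(const cheap_string&) = delete;  // omitted in this sketch

    ~cheap_string() {
        if (myOwns) delete[] myData;
    }

    bool operator<(const cheap_string& rhs) const {
        return std::strcmp(myData, rhs.myData) < 0;
    }
};

int lookup_with_proxy() {
    std::map<cheap_string, int> m;
    m.emplace("foo", 1);                           // the stored key owns a copy
    auto it = m.find(cheap_string(proxy, "foo"));  // the temporary key only borrows the literal
    return it != m.end() ? it->second : -1;
}
```

Because the proxy is built by a constructor rather than returned from a static function, no copy of the temporary is involved, whether or not the compiler elides copies.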

+1

Well, the map's find actually takes a const reference to the key, so you cannot avoid creating the key at some point.
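One caveat, offered as an aside rather than as part of the original answers: the const-reference signature is the only `find` available with the default comparator. Assuming a C++14 or later compiler, declaring the map with the transparent comparator `std::less<>` enables an additional template overload of `find` that compares the stored `std::string` keys against a `const char*` directly. The alias `Map` and the function `find_without_temporary` below are names made up for this sketch.

```cpp
#include <functional>
#include <map>
#include <string>

// std::less<> (that is, std::less<void>) is a transparent comparator (C++14).
using Map = std::map<std::string, int, std::less<>>;

int find_without_temporary(const Map& m, const char* key) {
    // Selects the template overload of find: the key is compared with the
    // stored strings via operator<, without building a std::string first.
    auto it = m.find(key);
    return it != m.end() ? it->second : -1;
}
```

This covers the ordered `std::map`; `std::unordered_map` only gained a comparable mechanism in C++20, and it additionally requires a transparent hash.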

For the first part of the code, you can keep a static const std::string with the value "foo" and search with that. This way no copy is created on each lookup.
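A minimal sketch of that suggestion, with the function name `find_foo` and the constant `kFoo` chosen here for illustration:

```cpp
#include <map>
#include <string>

int find_foo(const std::map<std::string, int>& m) {
    // Constructed once, on the first call; later calls reuse the same object.
    static const std::string kFoo = "foo";
    auto it = m.find(kFoo);  // binds directly to find's const reference parameter
    return it != m.end() ? it->second : -1;
}
```

This only helps when the key is known in advance; a key that arrives as a `const char*` at run time still has to be converted.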

If you want to go the Spartan way, you can always create your own type, which can be used as a string, but can also contain a pointer to string literals.

But in any case, the overhead of the map lookup itself is so large that this optimization hardly matters. If I were you, I would first replace map / unordered_map with Google's dense hash map. Then I would run Intel VTune (called Amplifier these days), see where the time goes, and optimize those places. I doubt that strings would appear among the top ten bottlenecks.

0

Take a look at the StringRef class from LLVM.

A StringRef can be constructed very cheaply from a C string, a string literal, or a std::string. If you make a map of these instead of std::string, constructing the lookup key is very fast.

This is a very fragile scheme, though. You must be sure that whatever string source you insert stays alive and unmodified for the lifetime of the map.
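The same idea can be sketched without LLVM by swapping in `std::string_view`, the C++17 standard library type that plays a similar role to `StringRef`. This is a substitution made for illustration, not the LLVM API, and the function name `string_view_map_demo` is hypothetical.

```cpp
#include <map>
#include <string>
#include <string_view>

int string_view_map_demo() {
    // The map stores views only. These keys are string literals with static
    // storage duration, so they outlive the map.
    std::map<std::string_view, int> m;
    m.emplace("foo", 1);
    m.emplace("bar", 2);

    const char* key = "foo";
    std::string other = "bar";

    // Both lookups build a temporary string_view (a pointer plus a length);
    // no character data is copied and nothing is allocated.
    return m.at(key) + m.at(other);
}
```

Inserting a view of a local `std::string` that is destroyed before the map would leave a dangling key, which is exactly the fragility described above.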

0

You can avoid the temporary by giving std::map your own comparator class that compares char* values. (The default comparison is by pointer address, which is not what you want. You need to compare the string contents.)

So something like:

    class StrCmp
    {
    public:
        // const-qualified so that it can be called through a const map.
        bool operator()(const char *a, const char *b) const
        {
            return strcmp(a, b) < 0;
        }
    };

    // Later:
    std::map<const char *, int, StrCmp> m;

Then use it like a normal map, but pass char* keys. Keep in mind that everything you store in the map must stay alive for the lifetime of the map. This means that you either need string literals or you have to keep the pointed-to data alive yourself. For these reasons, I would go with std::map<std::string> and accept the temporary until profiling shows that avoiding it is really necessary.
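A short usage sketch of the comparator approach, with the hypothetical function name `cstring_map_demo`. It looks up a key held in a separate buffer to show that the comparison is by content, not by address:

```cpp
#include <cstring>
#include <map>

struct StrCmp {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

int cstring_map_demo() {
    std::map<const char*, int, StrCmp> m;
    m["foo"] = 1;  // literals live for the whole program, so storing the pointer is safe
    m["bar"] = 2;

    char buffer[] = "foo";     // different address, same characters
    auto it = m.find(buffer);  // compared by content; no std::string is created
    return it != m.end() ? it->second : -1;
}
```

Only the lookup key may be short-lived here; the pointers stored as keys must keep pointing at valid, unmodified characters for as long as the map exists.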

0

Source: https://habr.com/ru/post/906724/

