Using the OpenMP threadprivate directive for static instances of STL types

Consider the following snippet:

    #include <map>
    class A {
        static std::map<int,int> theMap;
        #pragma omp threadprivate(theMap)
    };
    std::map<int,int> A::theMap;

Compilation using OpenMP fails with an error message:

    $ g++ -fopenmp -c main.cpp
    main.cpp:5:34: error: 'threadprivate' 'A::theMap' has incomplete type

I do not understand this. The code compiles without the #pragma, which should mean that std::map is not an incomplete type. It also compiles if theMap has a primitive type (double, int, ...).

How can I make a global static std::map threadprivate?

+4
3 answers

This is a compiler limitation. The Intel C/C++ compiler supports C++ classes in threadprivate, while GCC and MSVC currently do not.

For example, in MSVC (VS 2010) you get this error (I removed the class):

    static std::map<int,int> theMap;
    #pragma omp threadprivate(theMap)
    error C3057: 'theMap' : dynamic initialization of 'threadprivate' symbols is not currently supported

So the workaround is fairly obvious, but dirty: you create a very simple thread-local storage (TLS) yourself. A simple approach:

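    #include <map>
    const static int MAX_THREAD = 64;
    struct MY_TLS_ITEM {
        std::map<int,int> theMap;
        // Pad each item to one 64-byte cache line.
        // Assumes sizeof(std::map<int,int>) is less than 64.
        char padding[64 - sizeof(std::map<int,int>)];
    };
    __declspec(align(64)) MY_TLS_ITEM tls[MAX_THREAD];  // MSVC-specific alignment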

Please note that the padding is there to avoid false sharing. I assume a 64-byte cache line, as on modern Intel x86 processors. __declspec(align(64)) is an MSVC extension that places the structure on a 64-byte boundary. Thus, each element of tls lies in a different cache line, which prevents false sharing. GCC has __attribute__ ((aligned(64))) for the same purpose.

To access this simple TLS, you can do this:

tls[omp_get_thread_num()].theMap;

Of course, you must call this inside one of the OpenMP parallel constructs. Conveniently, OpenMP provides an abstract thread ID in [0, N), where N is the maximum number of threads, which allows a simple and fast TLS implementation. In general, the native TID from the operating system is an arbitrary integer, so with native TIDs you would basically need a hash table, whose access time is longer than that of a simple array.
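The array-of-slots idea above can also be sketched portably. This is only an illustration, not the answerer's code: it swaps in C++11 alignas for the MSVC and GCC extensions, and the names TlsItem, CACHE_LINE and fill_squares, as well as the cap of 4 threads, are made up for the example. The serial fallback for omp_get_thread_num is likewise an assumption, added so the file still builds without OpenMP enabled.

```cpp
#include <cassert>
#include <map>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallback so the sketch also builds without -fopenmp.
inline int omp_get_thread_num() { return 0; }
#endif

const int MAX_THREAD = 64;  // upper bound on the team size, as in the answer
const int CACHE_LINE = 64;  // assumed cache-line size in bytes

// alignas rounds sizeof(TlsItem) up to a multiple of CACHE_LINE,
// so no manual padding array is needed.
struct alignas(CACHE_LINE) TlsItem {
    std::map<int, int> theMap;
};

static TlsItem tls[MAX_THREAD];

// Each thread stores i -> i*i in its own slot; the function returns
// the total number of entries found across all slots afterwards.
int fill_squares(int n) {
    #pragma omp parallel num_threads(4)  // hypothetical cap, well below MAX_THREAD
    {
        const int tid = omp_get_thread_num();
        assert(tid < MAX_THREAD);  // the slot index must stay inside the array
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            tls[tid].theMap[i] = i * i;
        }
    }
    int total = 0;
    for (int t = 0; t < MAX_THREAD; ++t) {
        total += static_cast<int>(tls[t].theMap.size());
    }
    return total;
}
```

Because every loop index is handled by exactly one thread, a first call such as fill_squares(100) should leave 100 entries spread over the slots that were actually used.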

+3

The incomplete type error is a compiler bug that can be worked around by instantiating std::map<int,int> before the threadprivate directive. But once you get past that problem, GCC 4.7 still does not support dynamic initialization of threadprivate variables. This will be supported in GCC 4.8.
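A minimal sketch of what this might look like, assuming a compiler that supports dynamic initialization of threadprivate variables (GCC 4.8 or later, according to the answer). The static_assert on sizeof is just one guess at how to force the instantiation; it has not been checked against GCC 4.7 and may be unnecessary on newer compilers. The member is made public and the function threads_with_own_map is hypothetical, both only for the demonstration.

```cpp
#include <cassert>
#include <map>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallback so the sketch also builds without -fopenmp.
inline int omp_get_thread_num() { return 0; }
#endif

// Taking sizeof requires a complete type, which forces std::map<int,int>
// to be instantiated before the threadprivate directive is parsed.
static_assert(sizeof(std::map<int, int>) > 0, "std::map<int,int> must be complete");

class A {
public:  // public only so the demo function below can reach the member
    static std::map<int, int> theMap;
    #pragma omp threadprivate(theMap)
};

std::map<int, int> A::theMap;

// Every thread clears and fills its own copy of A::theMap. Returns the
// number of threads whose copy holds exactly one entry afterwards.
int threads_with_own_map() {
    int good = 0;
    #pragma omp parallel reduction(+ : good)
    {
        const int tid = omp_get_thread_num();
        A::theMap.clear();
        A::theMap[tid] = tid;
        assert(A::theMap.count(tid) == 1);
        if (A::theMap.size() == 1) {
            ++good;
        }
    }
    return good;
}
```

If theMap were shared rather than threadprivate, the concurrent clear and insert calls would race; with one copy per thread, each thread only ever sees its own single entry.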

+1

Anything declared threadprivate is replicated for each thread. I did this by creating a static object (the class itself does not have to be static, only the instance of the object). Maybe this is what you want?

Now consider the case where you want some members of the class to be shared between threads. Making only some members of the class static implies that, if each thread instantiated this object, only the static part would be replicated (because it is threadprivate) and not the whole object (shared memory is not replicated). That would require one object to be the whole object and every other object to be smaller (without duplicating the shared memory) while still holding a reference to the shared memory, which frankly does not make sense.

As a suggestion, write two classes: one for the strictly thread-private data and one for the shared data.
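One possible reading of that split, sketched under assumptions: the answer does not say how the two classes should be used, so here the private class is simply instantiated inside the parallel region, which makes it private to each thread without any threadprivate directive. The names SharedData, PrivateData and count_entries are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Data that all threads read; it is not modified inside the parallel region.
struct SharedData {
    std::vector<int> input;
};

// Data that each thread owns exclusively.
struct PrivateData {
    std::map<int, int> theMap;
};

// Each thread copies its share of the input into its own PrivateData.
// Returns the number of entries summed over all threads.
int count_entries(const SharedData& shared) {
    const int n = static_cast<int>(shared.input.size());
    int total = 0;
    #pragma omp parallel reduction(+ : total)
    {
        PrivateData mine;  // declared inside the region, so one per thread
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            mine.theMap[i] = shared.input[i];
        }
        total += static_cast<int>(mine.theMap.size());
    }
    return total;
}
```

The trade-off is that the per-thread object lives only for the duration of the parallel region, whereas a threadprivate variable is intended to persist across regions under the conditions the OpenMP specification lays out.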

0

Source: https://habr.com/ru/post/1380187/

