C++ - code runs faster when using the Debug runtime library in Visual Studio 2013

tl;dr: Can someone explain the performance differences shown in the table below?

Code setup: There is an array of integers that is populated with values inside a for loop.
VS project settings: Two profiles (configurations) are used. The first is the default Release profile, and the second, call it D_Release, is an exact copy of the Release profile with only one difference: D_Release uses the Multi-threaded Debug DLL runtime (/MDd).

I ran the code with both profiles, and for each profile I stored the array on the heap, on the stack and in bss (so there are 6 different configurations).

The times I measured are as follows:

     ------+---------+-----------
    |      |   /MD   |   /MDd    |
    |------+---------+-----------|
    | Heap | 8.5 ms  | 3.5 ms    |
    |------+---------+-----------|
    | Stack| 3.5 ms  | 3.5 ms    |
    |------+---------+-----------|
    | bss  | 10 ms   | 10 ms     |
     ------+---------+-----------

[START EDIT]
After some comments, I measured the size of the working set just before the loop and got the following results:

     ------+---------+-----------
    |      |   /MD   |   /MDd    |
    |------+---------+-----------|
    | Heap | 2.23 MB | 40.6 MB   |
    |------+---------+-----------|
    | Stack| 40.4 MB | 40.6 MB   |
    |------+---------+-----------|
    | bss  | 2.17 MB | 2.41 MB   |
     ------+---------+-----------
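
The post does not say how the working set was read. One possible way, sketched here purely as an assumption, is the Win32 call `GetProcessMemoryInfo`, which reports the current `WorkingSetSize` of a process in bytes. The helper names `working_set_bytes` and `to_mb` are hypothetical, and treating 1 MB as 2^20 bytes is also an assumption about the units in the table. On non-Windows builds the helper simply returns 0.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#include <psapi.h>
#pragma comment(lib, "psapi.lib")  // MSVC-specific; other toolchains must link psapi themselves
#endif

// Hypothetical helper: current working set of this process in bytes.
// Returns 0 if the query fails or the platform is not Windows.
std::size_t working_set_bytes() {
#ifdef _WIN32
    PROCESS_MEMORY_COUNTERS pmc;
    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc))) {
        return static_cast<std::size_t>(pmc.WorkingSetSize);
    }
#endif
    return 0;
}

// Hypothetical helper: bytes to MB, assuming 1 MB = 2^20 bytes.
double to_mb(std::size_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

Printing `to_mb(working_set_bytes())` immediately before the timed loop, once per configuration, should give numbers comparable to the table.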

[END EDIT]

When using the Multi-threaded DLL runtime (/MD, the default Release profile) and storing the array on the heap, the code is much slower. With the array in bss, performance is slow in both profiles.
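
The working-set figures above hint at one thing worth testing: in the fast configurations the array's pages are already resident before the loop starts, so the slow ones may be paying for first-touch page faults inside the timed region. The debug CRT heap fills new allocations with a byte pattern (0xCD), which would touch every page ahead of time. Whether that explains the heap row is an assumption, not something the measurements prove. A minimal sketch of the experiment, with a hypothetical `time_fill` helper and `memset` standing in for the debug fill:

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct FillResult {
    long long micros;    // time spent in the fill loop, or -1 on allocation failure
    long long checksum;  // sum of all elements, read after the timer stops
};

// Hypothetical helper: times the fill loop from the question on a
// malloc'ed array of n ints. If pretouch is true, every byte is written
// once before the timer starts, so the pages are already resident.
FillResult time_fill(std::size_t n, bool pretouch) {
    FillResult result = { -1, 0 };
    int* a = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (a == nullptr) return result;

    if (pretouch) {
        // Stand-in for the debug heap's fill pattern (assumption).
        std::memset(a, 0xCD, n * sizeof(int));
    }

    typedef std::chrono::steady_clock clock_type;
    const clock_type::time_point start = clock_type::now();
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = static_cast<int>(i);
    }
    const clock_type::time_point end = clock_type::now();

    // Read the data back so the stores cannot be discarded as unused.
    for (std::size_t i = 0; i < n; ++i) result.checksum += a[i];

    result.micros =
        std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    std::free(a);
    return result;
}
```

If, in a /MD build, `time_fill(10000000, true)` lands near the 3.5 ms cases while `time_fill(10000000, false)` lands near the 8.5 ms case, that would support the page-fault explanation. It would not by itself explain the bss row.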

Actual question: I find it strange that the Debug DLL is faster. Can someone explain the differences in performance?

Additional information: I tried manually defining and undefining the _DEBUG macro to make sure the difference really comes from using a different DLL. I also used different timers (for example QueryPerformanceCounter), ran the executables both from VS and from the command line, and tried setting _NO_DEBUG_HEAP=1. I also used aligned malloc (_mm_malloc) to see whether anything changed, but it made no difference.

Information about VS runtime libraries: http://msdn.microsoft.com/en-us/library/2kzt1wy3.aspx

Code used:

    #include <iostream>
    #include <chrono>
    #include <cstdint>   // std::uintptr_t
    #include <cstdio>    // getchar
    #include <cstdlib>   // malloc, free

    using std::cout;
    using std::cerr;
    using std::endl;

    typedef std::chrono::high_resolution_clock hclock;

    #define ALIGNMENT 32
    #ifdef _MSC_VER
      #define ALIGN __declspec(align(ALIGNMENT))
    #else
      // Fallbacks for other compilers: no alignment guarantee here
      #define ALIGN
      #ifndef _mm_malloc
        #define _mm_malloc(a, b) malloc(a)
      #endif
      #ifndef _mm_free
        #define _mm_free(a) free(a)
      #endif
    #endif

    #define HEAP 0
    #define STACK 1
    #define BSS 2

    //SWITCH HERE
    #define STORAGE 0

    int main()
    {
        const size_t size = 10000000;

    #if STORAGE == HEAP
        cout << "Storing in the Heap\n";
        int * a = (int*)_mm_malloc(sizeof(int)*size, ALIGNMENT);
    #elif STORAGE == STACK
        // 40 MB array: needs a stack reserve larger than the default
        cout << "Storing in the Stack\n";
        ALIGN int a[size];
    #else
        cout << "Storing in the BSS\n";
        ALIGN static int a[size];
    #endif

        // Cast to an integer wide enough for a pointer (int truncates on 64-bit)
        if (reinterpret_cast<std::uintptr_t>(a) % ALIGNMENT)
        {
            cerr << "Data is not aligned" << endl;
        }

        //MAGIC STARTS HERE
        hclock::time_point end, start = hclock::now();
        for (unsigned int i = 0; i < size; ++i)
        {
            a[i] = i;
        }
        end = hclock::now();
        //MAGIC ENDS HERE

        cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count()
             << " us" << endl;

    #if STORAGE == HEAP
        _mm_free(a);
    #endif

        getchar();  // keep the console window open
        return 0;
    }

Source: https://habr.com/ru/post/980017/

