I was initially comparing the performance of built-in D arrays and plain pointers, but I ran into a different problem. For some reason, if I run two identical loops one after the other, the second always runs faster.
Here is the code:
import std.stdio : writeln;
import std.datetime : StopWatch;
import core.stdc.stdlib : malloc, free;
void main()
{
    immutable N = 1_000_000_000;
    StopWatch sw;
    uint* ptr = cast(uint*)malloc(uint.sizeof * N);
    sw.start();
    for (uint i = 0; i < N; ++i)
        ptr[i] = 1;
    sw.stop();
    writeln("the first for loop time: ", sw.peek().msecs(), " msecs");
    sw.reset();
    sw.start();
    for (uint i = 0; i < N; ++i)
        ptr[i] = 2;
    sw.stop();
    writeln("the second for loop time: ", sw.peek().msecs(), " msecs");
    sw.reset();
    free(ptr);
}
After compiling and running it with dmd -release -O -noboundscheck -inline test.d -of=test && ./test it prints:
the first for loop time: 1253 msecs
the second for loop time: 357 msecs
I was not sure whether this was due to D or to dmd, so I rewrote the code in C++:
#include <iostream>
#include <chrono>
#include <cstdlib> // malloc, free
int main()
{
    const unsigned int N = 1000000000;
    unsigned int* ptr = (unsigned int*)malloc(sizeof(unsigned int) * N);
    auto start = std::chrono::high_resolution_clock::now();
    for (unsigned int i = 0; i < N; ++i)
        ptr[i] = 1;
    auto finish = std::chrono::high_resolution_clock::now();
    auto milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(finish-start);
    std::cout << "the first for loop time: " << milliseconds.count() << " msecs" << std::endl;
    start = std::chrono::high_resolution_clock::now();
    for (unsigned int i = 0; i < N; ++i)
        ptr[i] = 2;
    finish = std::chrono::high_resolution_clock::now();
    milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(finish-start);
    std::cout << "the second for loop time: " << milliseconds.count() << " msecs" << std::endl;
    free(ptr);
}
and g++ -O3 test.cpp -o test && ./test gives a similar result:
the first for loop time: 1029 msecs
the second for loop time: 349 msecs
Why is the second loop so much faster than the first, when the two are identical?
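A follow-up experiment that might help narrow this down is to time three or more passes over the same buffer instead of two. The sketch below is not something I have measured; the helper names `time_fill_ms` and `time_passes` are made up for illustration, and it uses `steady_clock` rather than `high_resolution_clock` as a matter of preference. If only the first pass turns out slow, the cost would seem tied to the first touch of the freshly allocated memory rather than to the loop itself, but that is an assumption the timings above do not establish.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: writes `value` into n elements and returns the elapsed milliseconds.
long long time_fill_ms(unsigned int* ptr, std::size_t n, unsigned int value)
{
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)
        ptr[i] = value;
    auto finish = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(finish - start).count();
}

// Hypothetical helper: allocates one buffer and times `passes` identical fills over it.
// Returns an empty vector if the allocation fails.
std::vector<long long> time_passes(std::size_t n, int passes)
{
    std::vector<long long> times;
    unsigned int* ptr = static_cast<unsigned int*>(std::malloc(sizeof(unsigned int) * n));
    if (ptr == nullptr)
        return times;
    for (int p = 0; p < passes; ++p)
        times.push_back(time_fill_ms(ptr, n, static_cast<unsigned int>(p + 1)));
    // Read one element back so the stores are not trivially dead.
    // This is a cheap guard against the optimizer, not an airtight one.
    volatile unsigned int sink = ptr[n - 1];
    (void)sink;
    std::free(ptr);
    return times;
}
```

Calling `time_passes(1000000000, 3)` would mirror the N used above, assuming the machine can provide the roughly 4 GB the buffer needs; printing the three returned values side by side shows whether the slowdown is confined to the first pass.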