I wrote a small sample program in C++ using boost::thread. Since it is 215 lines, I put it on pastebin instead:
http://pastebin.com/LRZ24W7D
The program creates a large array of floats (currently 1 GB) and sums them, first sequentially and then using several threads (which live inside the device_matrix class). Assuming the machine is SMP, I expect the threaded code to show a speedup. On my Windows machine I do see a fourfold speedup when using 4 instances of device_matrix (giving 4 threads, on my dual-core hyper-threaded Intel Core2 CPU). The output on Windows is as follows:
starting computation
device_matrix count 4
elements 268435456
UINT_MAX 4294967295
data size total 1024 mb
size per device_matrix 256 mb
reference 134224128.00000
result 134224128.00000
time taken (init) 12.015 secs
time taken (single) 3.422 secs
time taken (device) 0.859 secs
However, when I compile the same code on an Ubuntu machine I have access to, I see the following output:
starting computation
device_matrix count 8
elements 268435456
UINT_MAX 4294967295
data size total 1024 mb
size per device_matrix 128 mb
reference 134215408.00000
result 134215400.00000
time taken (init) 3.670 secs
time taken (single) 3.030 secs
time taken (threaded) 3.950 secs
Output of uname -a on the Ubuntu machine:
Linux gpulab03 2.6.32-23-generic
Output of hwinfo -short:
cpu:
Intel(R) Core(TM) i7 CPU 930 @ 2.80GHz, 1600 MHz
... 7 more times
(so 8 logical CPUs, with HT)
Windows:
cl /Fe"boost.exe" /EHsc -I. boost.cpp /link /LIBPATH:"C:\boost\boost_1_45_0\stage\lib"
Ubuntu:
g++ -O0 -v -o boost -I$HOME/Code/boost -L$HOME/Code/boost/stage/lib boost.cpp -lboost_thread-gcc44-mt
http://pastebin.com/Gj6W3pcs
Is there something about Linux or GCC that I am missing, or something in how I am using boost::thread?
Running the program under time (rather than relying only on boost::timer) gives:
real 0m9.788s
user 0m9.500s
sys 0m0.280s
With 8 threads:
real 0m7.292s
user 0m10.340s
sys 0m0.340s
Does anyone know what is going on with boost on Linux here?