Is there any benefit to using float instead of double on a 64-bit processor?

I always use double for calculations, but double offers far more precision than I need (or than makes sense, given that most of my calculations are approximations to begin with).

But since the processor is already 64-bit, I do not expect that using a type with fewer bits will have any benefit.

Am I right or wrong? How would I optimize for speed? (I understand that smaller types are more memory efficient.)

Here is a test:

    #include <cmath>
    #include <ctime>
    #include <cstdio>

    // Allocate an m x n matrix as one contiguous, zero-initialized block,
    // with M[i] pointing at the start of row i.
    template<typename T>
    void creatematrix(int m, int n, T **&M) {
        M = new T*[m];
        T *M_data = new T[m * n]();
        for (int i = 0; i < m; ++i) {
            M[i] = M_data + i * n;
        }
    }

    int main() {
        clock_t start, end;
        double diffs;
        const int N = 4096;
        const int rep = 8;

        float **m1, **m2;   // element type being timed
        creatematrix(N, N, m1);
        creatematrix(N, N, m2);

        start = clock();
        for (int k = 0; k < rep; k++) {
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++)
                    m1[i][j] = sqrt(m1[i][j] * m2[i][j] + 0.1586);
            }
        }
        end = clock();

        diffs = (end - start) / (double)CLOCKS_PER_SEC;
        printf("time = %lf\n", diffs);

        // Each matrix is two allocations: the data block and the row pointers.
        delete[] m1[0];
        delete[] m1;
        delete[] m2[0];
        delete[] m2;

        getchar();
        return 0;
    }

There was no time difference between double and float; however, when the square root is not used, float is twice as fast.
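One detail of the test that may be worth checking is the type of the expression passed to sqrt. The literal 0.1586 has type double, so the sum is computed in double even when the matrices hold float. The sketch below uses hypothetical alias names and only demonstrates the C++ typing rules; it does not by itself explain the timings above.

```cpp
#include <cmath>
#include <type_traits>

// Alias names are hypothetical, chosen for this illustration only.

// float * float is float, but adding the double literal 0.1586
// converts the sum to double.
using WithDoubleLiteral = decltype(float{} * float{} + 0.1586);

// With an f suffix the literal is a float, so the expression stays float.
using WithFloatLiteral = decltype(float{} * float{} + 0.1586f);

// std::sqrt from <cmath> is overloaded on the argument type.
using SqrtOfDouble = decltype(std::sqrt(WithDoubleLiteral{}));
using SqrtOfFloat = decltype(std::sqrt(WithFloatLiteral{}));

static_assert(std::is_same<WithDoubleLiteral, double>::value,
              "double literal promotes the expression to double");
static_assert(std::is_same<WithFloatLiteral, float>::value,
              "float literal keeps the expression in float");
```

If the goal is to time pure float arithmetic, writing the constant as 0.1586f is one way to keep the whole expression in float.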

1 answer

There are several ways floats can be faster:

  • Faster I/O: you have only half as many bits to move between disk/memory/cache/registers.
  • Typically, the only operations that are slower for double are square root and division. For example, on Haswell a DIVSS (float division) takes 7 clock cycles, whereas a DIVSD (double division) takes 8-14 (source: Agner Fog's tables).
  • If you can use SIMD instructions, you can process twice as many values per instruction (i.e., a 128-bit SSE register holds 4 floats but only 2 doubles).
  • Special functions (log, sin) can use lower-degree polynomials: for example, the openlibm implementation of log uses a degree 7 polynomial, whereas logf needs only degree 4.
  • If you need higher intermediate precision, you can simply promote float to double, whereas for double you need either software double-double arithmetic or the slower long double.
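Three of the points above (data volume, SIMD width, and promotion to double for intermediate precision) can be illustrated with a small sketch. All helper names here are hypothetical, and the lane count is simple arithmetic on register width rather than real SIMD code.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helpers for illustration; none of these names come from the answer.

// Bytes needed to store a rows x cols matrix of T.
template <typename T>
constexpr std::size_t matrix_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T);
}

// Number of T values that fit in one SIMD register of the given width in bits.
template <typename T>
constexpr std::size_t lanes(std::size_t register_bits) {
    return register_bits / (CHAR_BIT * sizeof(T));
}

// Adds value to an accumulator count times, rounding to float after every step.
inline float sum_in_float(float value, std::size_t count) {
    assert(count > 0);
    float acc = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        acc += value;
    }
    return acc;
}

// Same loop, but each float input is promoted to a double accumulator.
inline double sum_in_double(float value, std::size_t count) {
    assert(count > 0);
    double acc = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        acc += value;  // implicit float -> double promotion
    }
    return acc;
}
```

Assuming the usual 4-byte float and 8-byte double, matrix_bytes<float>(4096, 4096) is 64 MiB against 128 MiB for double, and lanes<float>(128) is 4 against 2 for double. Summing 0.1f ten million times with sum_in_float drifts visibly from the expected total, while sum_in_double stays accurate at the cost of nothing more than a wider accumulator.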

Note that these points also hold for 32-bit architectures: unlike integers, there is nothing special about the size of the format matching your architecture. On most machines, doubles are just as "native" as floats.
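As a rough check of that last point, the sketch below reports the storage width and significand width of a floating-point type. The FormatInfo struct and format_info function are hypothetical names; the values come from the compiler's std::numeric_limits and do not depend on the pointer width of the target.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Hypothetical summary type for illustration.
struct FormatInfo {
    std::size_t bits;      // storage width in bits
    int significand_bits;  // precision, including the implicit leading bit
    bool ieee754;          // true if the type follows IEC 559 / IEEE 754
};

// Collects the format properties the compiler reports for T.
template <typename T>
constexpr FormatInfo format_info() {
    return FormatInfo{sizeof(T) * CHAR_BIT,
                      std::numeric_limits<T>::digits,
                      std::numeric_limits<T>::is_iec559};
}
```

On a typical IEEE 754 platform, whether 32-bit or 64-bit, format_info<float>() reports 32 bits with a 24-bit significand and format_info<double>() reports 64 bits with a 53-bit significand.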


Source: https://habr.com/ru/post/1241243/

