Floating point precision
C++11 incorporates the definition of FLT_EVAL_METHOD from C99 in <cfloat>.
FLT_EVAL_METHOD
Possible values:
-1 indeterminable
0 evaluate all operations and constants just to the range and precision of the type
1 evaluate float and double as double, and long double as long double
2 evaluate everything as long double
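To see which of these values your own compiler reports, something like the following sketch can help. The helper names describe_eval_method and current_eval_method are made up for illustration; only the FLT_EVAL_METHOD macro itself comes from <cfloat>.

```cpp
#include <cfloat>   // FLT_EVAL_METHOD (available since C++11)
#include <string>

// Hypothetical helper: turn a FLT_EVAL_METHOD value into a description.
std::string describe_eval_method(int method) {
    switch (method) {
        case 0:  return "each operation uses the range and precision of its type";
        case 1:  return "float and double are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        case -1: return "indeterminable";
        default: return "implementation-defined";  // e.g. other negative values
    }
}

// Hypothetical helper: describe the setting this translation unit was built with.
std::string current_eval_method() {
    return describe_eval_method(FLT_EVAL_METHOD);
}
```

Printing current_eval_method() from main shows the mode in effect. The value can change with compiler flags and target, so it is worth checking with the exact options you build with.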
If your compiler defines FLT_EVAL_METHOD as 2, then the computations of r1 and r2, and likewise those of s1 and s2, below are equivalent:
    double var3 = …;
    double var4 = …;

    double r1 = var3 * var4;
    double r2 = (long double)var3 * (long double)var4;

    long double s1 = var3 * var4;
    long double s2 = (long double)var3 * (long double)var4;
If your compiler defines FLT_EVAL_METHOD as 2, then in all four computations above the multiplication is performed with the precision of the type long double.
However, if the compiler defines FLT_EVAL_METHOD as 0 or 1, r1 and r2 are not always the same, and neither are s1 and s2. The multiplications in the computations of r1 and s1 are performed with the precision of double. The multiplications in the computations of r2 and s2 are performed with the precision of long double.
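As a rough sketch, the four computations can be wrapped in a function so you can try them on concrete inputs. The struct Products and the function multiply_four_ways are hypothetical names; the four expressions are the ones discussed above.

```cpp
// Hypothetical aggregate holding the four results discussed above.
struct Products {
    double r1;
    double r2;
    long double s1;
    long double s2;
};

// Hypothetical helper: compute the product in the four ways shown above.
Products multiply_four_ways(double var3, double var4) {
    Products p;
    p.r1 = var3 * var4;                              // double operands, double result
    p.r2 = (long double)var3 * (long double)var4;    // widened operands, rounded to double
    p.s1 = var3 * var4;                              // double operands, long double result
    p.s2 = (long double)var3 * (long double)var4;    // widened operands, long double result
    return p;
}
```

For example, with var3 = var4 = 1 + 2^-30 the exact product is 1 + 2^-29 + 2^-60. On an implementation where FLT_EVAL_METHOD is 0 and long double has a 64-bit significand, s2 keeps the 2^-60 term while s1 has already lost it. Where long double has the same format as double, all four results agree.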
Getting wide results from narrow arguments
If you compute results that are intended to be stored in a type wider than the type of the operands, like result1 and result2 in your question, you should always convert the arguments to a type at least as wide as the target, as you do here:
    result2 = (long double)var3 * (long double)var4;
Without this conversion (that is, if you write var3 * var4), and if the compiler defines FLT_EVAL_METHOD as 0 or 1, the product is computed with double precision, which is a shame, because it is intended to be stored in a long double.
If the compiler defines FLT_EVAL_METHOD as 2, then the conversions in (long double)var3 * (long double)var4 are not needed, but they do not hurt either: the expression means exactly the same with them and without them.
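The same advice can be illustrated one level down, with float operands and a double result. This is a deliberate variation on the question's types, chosen because long double has the same format as double on some platforms, which would hide the effect. The function names are hypothetical, and the sketch assumes float and double are IEEE 754 binary32 and binary64.

```cpp
// Hypothetical helper: the product is formed from float operands. With
// FLT_EVAL_METHOD == 0 it is rounded to float before being widened for
// the return value.
double narrow_product(float a, float b) {
    return a * b;
}

// Hypothetical helper: the operands are widened first, so the
// multiplication itself is carried out in double.
double wide_product(float a, float b) {
    return static_cast<double>(a) * static_cast<double>(b);
}
```

With a = 1 + 2^-12, the exact square is 1 + 2^-11 + 2^-24. wide_product returns exactly that value, because the product of two binary32 values always fits in binary64. narrow_product loses the 2^-24 term when FLT_EVAL_METHOD is 0.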
Digression: if the destination format is as narrow as the arguments, when is extended precision better for intermediate results?
Paradoxically, for a single operation, it is best to round only once, directly to the target precision. The only effect of computing a single multiplication in extended precision is that the result is rounded to extended precision first and then to double precision. This makes it less accurate. In other words, with FLT_EVAL_METHOD 0 or 1, the result r2 above is sometimes less accurate than r1 because of double rounding, and if the compiler uses IEEE 754 floating point, it is never more accurate.
The situation is different for larger expressions that contain several operations. For these, it is usually better to compute intermediate results in extended precision, either through explicit conversions or because the compiler uses FLT_EVAL_METHOD == 2. This question and its accepted answer show that, when the intermediate computations are done in 80-bit extended precision and the arguments and result are IEEE 754 binary64, the interpolation formula u2 * (1.0 - u1) + u1 * u3 always yields a result between u2 and u3 for u1 between 0 and 1. This property may not hold when the intermediate computations are done in binary64 precision, because of the larger rounding errors.
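A minimal sketch of the interpolation formula with explicit long double intermediates might look like this. The function name interpolate is hypothetical. The code only forces the intermediate type and does not by itself prove the bound: the property described above relies on long double being the 80-bit extended format, which is not the case on every platform.

```cpp
// Hypothetical helper: evaluate u2 * (1.0 - u1) + u1 * u3 with long double
// intermediates, then round once to double at the end.
double interpolate(double u1, double u2, double u3) {
    long double t = u1;  // widening is exact
    long double result = u2 * (1.0L - t) + t * u3;  // all operands promoted to long double
    return static_cast<double>(result);
}
```

For instance, interpolate(0.5, 1.0, 3.0) returns 2.0, and the endpoints u1 = 0 and u1 = 1 return u2 and u3 respectively.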