You compute value1 - value2 several times in your function. Just do it once.
That cast to uint8_t can also be problematic. As far as performance is concerned, the best integral type to use when converting from double to an integer is int, just as the best integral type to use as an array index is int.

    max_value = value1;
    diff = value1 - value2;
    if (diff < 0.0) {
        max_value = value2;
        diff = -diff;
    }
    if (diff >= 5.0) {
        return max_value;
    }
    else {
        return max_value + LUT[(int)(diff * 10.0)];
    }

Note that the above guarantees that the index into the LUT is between 0 (inclusive) and 50 (exclusive). There is no need for uint8_t here.
Edit
After playing with some variations, this is a fairly fast LUT-based approximation of log(exp(value1)+exp(value2)):

    #include <stdint.h>

    // intptr_t *happens* to be fastest on my machine. YMMV.
    typedef intptr_t IndexType;

    double log_sum_exp (double value1, double value2, double *LUT) {
        double diff = value1 - value2;
        if (diff < 0.0) {
            value1 = value2;
            diff = -diff;
        }
        IndexType idx = diff * 10.0;
        if (idx < 50) {
            value1 += LUT[idx];
        }
        return value1;
    }

The integral type IndexType is one of the keys to the speedup. I tested with clang and g++, and both indicated that casting to intptr_t (long on my computer) and using intptr_t as the index into the LUT is faster than other integral types, and considerably faster than some. For example, unsigned long long and uint8_t are incredibly bad choices on my computer.
The type is not just a hint, at least with the compilers I used. Those compilers did exactly what the code told them to do when converting from the floating point type to the integral type, regardless of optimization level.
Another speedup comes from comparing the integral type with 50, as opposed to comparing the floating point type with 5.0.
One final note on speed: not all compilers are created equal. On my computer (YMMV), g++ -O3 generates significantly slower code (25% slower on this problem!) than clang -O3, which in turn generates code that is slightly slower than that generated by clang -O4.
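Since these timings depend on the machine and the compiler, they are worth re-measuring. A rough sketch of a timing harness follows; the macro `INDEX_TYPE` and the names `BenchIndex`, `lse_kernel` and `time_log_sum_exp` are hypothetical, the input pattern is arbitrary, and `clock()` measures CPU time only coarsely:

```c
#include <stdint.h>
#include <time.h>

#ifndef INDEX_TYPE
#define INDEX_TYPE intptr_t   /* override to test int, uint8_t, ... */
#endif
typedef INDEX_TYPE BenchIndex;

/* Same logic as log_sum_exp, using the index type under test. */
static double lse_kernel(double value1, double value2, const double *lut)
{
    double diff = value1 - value2;
    if (diff < 0.0) {
        value1 = value2;
        diff = -diff;
    }
    BenchIndex idx = (BenchIndex)(diff * 10.0);
    if (idx < 50) {
        value1 += lut[idx];
    }
    return value1;
}

/* Call the kernel `iterations` times and return the elapsed CPU seconds.
   The running sum is stored in *sink so the loop cannot be optimized away. */
double time_log_sum_exp(const double *lut, long iterations, double *sink)
{
    double acc = 0.0;
    clock_t start = clock();
    for (long i = 0; i < iterations; ++i) {
        double a = (double)(i % 97) * 0.1;   /* values from 0.0 to 9.6 */
        acc += lse_kernel(a, 3.0, lut);
    }
    clock_t end = clock();
    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Building the same file several times with, for example, `-DINDEX_TYPE=int` or `-DINDEX_TYPE=uint8_t`, and with each compiler and optimization level of interest, gives numbers that can be compared directly.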
I also played with a rational function approximation (similar to Mark Ransom's answer), but the above obviously does not use this approach.