How about a macro shell that gives the effect of __restrict at compile time (pseudocode below, not checked):
Now the intermediate method is defined as
inline void Multiply_restrict(const MatrixMN* __restrict pA, const MatrixMN* __restrict pB, MatrixMN* __restrict pC) { Multiply_(*pA, *pB, *pC); }
And finally, just append _ to the name of the original Multiply:
void Multiply_(const MatrixMN& a, const MatrixMN& b, MatrixMN& out);
Thus, the final effect will be the same as if you called:
Multiply(x, y, answer);
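Putting the pieces together, a minimal sketch might look like the following. The `MatrixMN` struct, the `kDim` constant, the body of `Multiply_`, and the exact form of the `Multiply` macro are all assumptions made for illustration; only the two function signatures come from the snippets above. `__restrict` is a compiler extension (accepted by GCC, Clang, and MSVC), not standard C++.

```cpp
#include <cstddef>

// Hypothetical stand-in for the real matrix type (assumption: 2x2 floats).
constexpr std::size_t kDim = 2;
struct MatrixMN {
    float m[kDim][kDim];
};

// The original routine, renamed with a trailing underscore.
// The body is an assumed plain triple loop, used only so the sketch runs.
inline void Multiply_(const MatrixMN& a, const MatrixMN& b, MatrixMN& out) {
    for (std::size_t i = 0; i < kDim; ++i) {
        for (std::size_t j = 0; j < kDim; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < kDim; ++k) {
                sum += a.m[i][k] * b.m[k][j];
            }
            out.m[i][j] = sum;
        }
    }
}

// Intermediate method: the pointers carry the __restrict promise.
inline void Multiply_restrict(const MatrixMN* __restrict pA,
                              const MatrixMN* __restrict pB,
                              MatrixMN* __restrict pC) {
    Multiply_(*pA, *pB, *pC);
}

// Assumed form of the macro shell: it takes the addresses of its arguments,
// so every argument has to be an lvalue.
#define Multiply(a, b, out) Multiply_restrict(&(a), &(b), &(out))
```

With this in place, existing call sites such as `Multiply(x, y, answer);` compile unchanged. Two caveats: the caller now promises that `answer` does not alias `x` or `y` (breaking that promise is undefined behavior), and whether the optimizer actually benefits probably depends on `Multiply_` being inlined into the wrapper.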