If you are interested in the internals, I would disassemble the expression to see which CPython bytecode it compiles to. Using Python 3:
```
>>> import dis
>>> def test(): return 2**3
...
>>> dis.dis(test)
  2           0 LOAD_CONST               3 (8)
              3 RETURN_VALUE
```
OK, so the calculation has already been done at compile time and the result is stored as a constant (CPython's constant folding). You get exactly the same bytecode for 2 * 2 * 2 (feel free to try), so for constant expressions it makes no difference which form you write.
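As a quick sanity check (a minimal sketch; the names `folded_pow` and `folded_mul` are just illustrative stand-ins for the `test` above), you can see the folded value show up directly among the code object's constants:

```python
import dis


def folded_pow():
    return 2 ** 3


def folded_mul():
    return 2 * 2 * 2


# The exact contents of co_consts vary between CPython versions,
# but the folded value 8 appears as a constant in both functions.
print(folded_pow.__code__.co_consts)
print(folded_mul.__code__.co_consts)

# Both disassemble to a single LOAD_CONST of 8 followed by RETURN_VALUE.
dis.dis(folded_pow)
dis.dis(folded_mul)
```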
What about when a variable is involved?
Now you get two different bits of bytecode:
```
>>> def test(n): return n ** 3
...
>>> dis.dis(test)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (3)
              6 BINARY_POWER
              7 RETURN_VALUE
```
vs.
```
>>> def test(n): return n * 2 * 2
...
>>> dis.dis(test)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (2)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               1 (2)
             10 BINARY_MULTIPLY
             11 RETURN_VALUE
```
Now the question is, of course: is the repeated BINARY_MULTIPLY faster than the single BINARY_POWER operation?
The best way to find out is to time it. I will use IPython's %timeit magic. Here is the output for the multiplication version:
```
%timeit test(100)
The slowest run took 15.52 times longer than the fastest. This could mean that an intermediate result is being cached
10000000 loops, best of 3: 163 ns per loop
```
and for the power version:
```
The slowest run took 5.44 times longer than the fastest. This could mean that an intermediate result is being cached
1000000 loops, best of 3: 473 ns per loop
```
You may want to repeat the timing for your typical inputs, but empirically the multiplication version looks faster (though note the caveat about run-to-run variance quoted in the output).
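If you are not in IPython, a rough equivalent using the standard-library `timeit` module might look like the sketch below (the names `mul_version` and `pow_version` are mine, standing in for the two `test` variants above; absolute numbers will differ by machine and CPython version):

```python
import timeit


def mul_version(n):
    return n * 2 * 2


def pow_version(n):
    return n ** 3


# timeit.repeat returns one total time (in seconds) per run;
# the minimum over the runs is the usual number to report.
for fn in (mul_version, pow_version):
    runs = timeit.repeat(lambda: fn(100), repeat=3, number=1000000)
    print("{}: best of 3: {:.0f} ns per loop".format(
        fn.__name__, min(runs) / 1000000 * 1e9))
```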
If you want to dig further into the internals, I would suggest looking at the CPython source code.
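As a starting point (this describes CPython 3.x before 3.11, where the specialized binary opcodes were merged into a generic BINARY_OP): BINARY_POWER is dispatched in Python/ceval.c and delegates to PyNumber_Power in Objects/abstract.c, while BINARY_MULTIPLY delegates to PyNumber_Multiply, so that is where the real cost difference between the two operations lives.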