Why is drawing POT images faster than drawing NPOT images?

Question

I was looking for canvas speed optimizations, and I found this answer: https://stackoverflow.com/a/212618/

"Do not use images with an odd width. Always use widths that are powers of 2."

So I wonder: why is this faster?

I have seen posts explaining that this helps with older graphics cards (when using OpenGL, etc.), but I am asking about speed, not compatibility, and about the canvas, not OpenGL / WebGL.

performance javascript canvas

Mijyn Jun 14 '13 at 3:32

1 answer

enhzflep · Accepted Answer · 2013-06-14T07:26:22+0000

This is faster because you can use the << operator instead of the * operator, i.e. it is quicker to perform a "shift left by 1" (multiply by two) than to perform a "multiply by 43". You can get around this limitation by adding padding bytes to the end of each line of the image (as MS did for in-memory bitmaps), but essentially this is a consequence of the difference in speed between the two instructions.
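The padding idea can be sketched in C as follows. This is only an illustration, not code from the canvas implementation of any browser: the helper names stride_shift, pixel_offset_padded and pixel_offset_mul are hypothetical, and the 43-pixel width is taken from the example above. Each row is padded to the next power of two so that the row offset becomes a shift instead of a multiply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest shift s such that (1 << s) >= width,
   i.e. log2 of the padded row length in pixels. */
static unsigned stride_shift(uint32_t width)
{
    assert(width > 0 && width <= (UINT32_C(1) << 31));
    unsigned shift = 0;
    while ((UINT32_C(1) << shift) < width) {
        ++shift;
    }
    return shift;
}

/* Offset of pixel (x, y) when every row is padded to (1 << shift) pixels:
   the row offset is computed with a shift. */
static size_t pixel_offset_padded(uint32_t x, uint32_t y, unsigned shift)
{
    return ((size_t)y << shift) + x;
}

/* Offset of pixel (x, y) in a tightly packed image of arbitrary width:
   the row offset needs a general multiply. */
static size_t pixel_offset_mul(uint32_t x, uint32_t y, uint32_t width)
{
    return (size_t)y * width + x;
}
```

For a 43-pixel-wide image, stride_shift(43) gives 6, so rows are stored 64 pixels apart and 21 pixels per row are wasted as padding. That is the trade-off: extra memory in exchange for cheaper addressing.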

Back in the old days of 8-bit 320x200 graphics (mode 13h), you could index a pixel using a simple formula:

pixOffset = xPos + yPos * 320;

But this was slow. A much better alternative was to use:

C

pixOffset = xPos + (yPos * 256) + (yPos * 64);   // i.e. (yPos << 8) + (yPos << 6)

Asm

mov ax, xPos    ; ax = xPos
mov bx, yPos    ; bx = yPos
shl bx, 6       ; bx = yPos * 64
add ax, bx      ; ax = xPos + (yPos * 64)
shl bx, 2       ; bx = yPos * 256
add ax, bx      ; ax = xPos + yPos * 320

This may seem counter-intuitive, but when well written it uses only single-clock instructions, i.e. the offset can be calculated in 6 clock cycles. Of course, pipelining and caches complicate the picture.
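As a quick sanity check of the decomposition 320 = 256 + 64, here is a minimal C sketch that compares the multiply form with the shift-and-add form for every pixel of a 320x200 screen. The function names offset_mul, offset_shift and offsets_agree are hypothetical, and the sketch says nothing about timing on any particular CPU; it only confirms the two formulas produce the same offsets.

```c
#include <stdint.h>

/* Mode 13h pixel offset using a general multiply. */
static uint16_t offset_mul(uint16_t x, uint16_t y)
{
    return (uint16_t)(x + (unsigned)y * 320u);
}

/* Same offset using two shifts and two adds: y*256 + y*64 == y*320. */
static uint16_t offset_shift(uint16_t x, uint16_t y)
{
    return (uint16_t)(x + ((unsigned)y << 8) + ((unsigned)y << 6));
}

/* Returns 1 if both forms agree for every pixel of a 320x200 screen,
   0 otherwise. */
static int offsets_agree(void)
{
    for (uint16_t y = 0; y < 200; ++y) {
        for (uint16_t x = 0; x < 320; ++x) {
            if (offset_mul(x, y) != offset_shift(x, y)) {
                return 0;
            }
        }
    }
    return 1;
}
```

The largest offset is 199 * 320 + 319 = 63999, which still fits in 16 bits, so the result can be held in a single 16-bit register such as ax.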

In addition, it is far cheaper to implement shift registers in hardware than a full multiplication unit, both in $$ and in transistors. The same number of transistors can therefore be used to provide better performance, or fewer can be used for the same performance with lower power dissipation.

AFAIK, the mul (and div) instructions of modern processors are implemented using look-up tables. For the most part this mitigates the problem, but it is not without issues of its own. For further reading, look up the Pentium FDIV bug (the look-up table inside the chips was populated incorrectly).

http://en.wikipedia.org/wiki/Pentium_FDIV_bug

So, in conclusion, it is essentially an artifact of the hardware / software used to implement the functions.
