Optimization y = x * x in Galois field arithmetic

I have this C code to multiply by GF (8):

int32_t GaloisMultiply (int32_t a, int32_t b) { int32_t i; int32_t mask = 0x100; int32_t y = 0; for(i=0;i<8;i++) { if(b & mask) { y ^= a; } mask >>= 1; y <<= 1; } if(b & 0x1) { y ^= a; } return(y); } 

This is more or less a textbook implementation.

I wonder if I have smart optimization for the above algorithm, if I can argue that a is always b, for example. I do squares instead of multiplication. I am not after cryptographic use by the way. I just want to use the fact that x * x in GF (8) interleaves the x bits with zero bits one by one.

There are already pretty clever methods for performing bit interleaving, but since I found that x * x in GF (8) does the bit interleaving operation (by accident), I can't stop trying to use it to optimize bit interleaving.

Any ideas?

+4
source share
6 answers

Based table? link

And when you are bounded by x * x, this is a sparse matrix.

Here's another good paper (and library)

+4
source
 int32_t GaloisMultiply( int32_t a ) { int32_t y = 0; int32_t b = a & 0x01ff; while ( b ) { if ( b & 1 ) y ^= a; a <<= 1; b >>= 1; } return y; } 

Or if you want:

 int32_t GaloisMultiply( int32_t a ) { int32_t y = 0; for ( int32_t b = a & 0x01ff; b; b >>= 1 ) { if ( b & 1 ) y ^= a; a <<= 1; } return y; } 

The reason this approach is more efficient than the source code above is primarily because the loop only runs until all the “interesting” bits in the argument are destroyed, rather than blindly checking all (9) bits.

The desktop approach will be faster though.

+1
source

The lookup table is by far the fastest for quadratic squares. It is also the fastest to multiply when using GF (8), but the tables are getting too large for the large fields used in ECC. For multiplication in large fields, the best algorithm is the “left to right” method ... (see http://www.amazon.com/Elliptic -Cryptography-Springer-Professional-Computing / dp / 038795273X algorithm 2.36, page 50).

+1
source

Perhaps you could write an assembly to improve things a bit. However, I would be very surprised if this were a bottleneck in your application; did you do any profiling? This feature does not seem optimal.

0
source

This is probably not what you are looking for, but here is one slight acceleration:

Pass only one argument if they are guaranteed to be the same.

0
source

This may help the compiler to mark "a" and "b" a little as const. Or unwind the loop manually. It would be sad if that helped, though ...

Isn't that a patent minefield, by the way?

0
source

Source: https://habr.com/ru/post/1277038/


All Articles