The most efficient way to apply AND or OR operations to bitmaps is to use hardware assistance. Many GPUs can perform operations on two bitmaps. There is no standard C++ library for this.
Without it, you need to perform the operation on each bit, byte, word, or double word in the bitmaps.
The next most speed-efficient method is to unroll the loop. Branch instructions take time that could be spent on data instructions, and they can cause the instruction pipeline to be flushed, which wastes more time.
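As a rough illustration of unrolling, the sketch below performs four byte-wise ANDs per loop test instead of one. The function name `and_bytes_unrolled` and its parameters are made up for this example, and the unroll factor of four is arbitrary; an optimizing compiler may already do this for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from any library): dst[i] = a[i] & b[i] for
// `size` bytes, with the loop body unrolled four times.
void and_bytes_unrolled(const std::uint8_t* a, const std::uint8_t* b,
                        std::uint8_t* dst, std::size_t size)
{
    assert(size == 0 || (a && b && dst));

    std::size_t i = 0;

    // Main body: four ANDs for every loop test and branch.
    for (; i + 4 <= size; i += 4) {
        dst[i]     = static_cast<std::uint8_t>(a[i]     & b[i]);
        dst[i + 1] = static_cast<std::uint8_t>(a[i + 1] & b[i + 1]);
        dst[i + 2] = static_cast<std::uint8_t>(a[i + 2] & b[i + 2]);
        dst[i + 3] = static_cast<std::uint8_t>(a[i + 3] & b[i + 3]);
    }

    // Tail: the zero to three bytes that did not fill a whole group.
    for (; i < size; ++i) {
        dst[i] = static_cast<std::uint8_t>(a[i] & b[i]);
    }
}
```

Calling `and_bytes_unrolled(a, b, c, n)` on three buffers of `n` bytes leaves `a` AND `b` in `c`. Whether the unrolled form is actually faster depends on the compiler and processor, so measure before keeping it.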
You can also gain some efficiency by using the processor's data cache effectively: load a batch of variables, perform the operation, store the result, and repeat.
You should also fetch groups using the processor's word size. A 32-bit processor likes to fetch 32 bits at a time, so a single fetch loads 8 four-bit pixels. Otherwise you would fetch 8 bits at a time, which means 4 fetches of 8 bits instead of 1 fetch of 32 bits.
Here's the main algorithm:
    // Assumes bitmap_size is a multiple of 4 and that all three
    // buffers are suitably aligned for 32-bit access.
    uint8_t * p_bitmap_a = &Bitmap_A[0];
    uint8_t * p_bitmap_b = &Bitmap_B[0];
    uint8_t * p_bitmap_c = &Bitmap_C[0];

    // C = A AND B, 32 bits per iteration
    for (unsigned int i = 0; i < bitmap_size / 4; ++i)
    {
        uint32_t a = *((uint32_t *) p_bitmap_a);
        uint32_t b = *((uint32_t *) p_bitmap_b);
        uint32_t c = a & b;
        *((uint32_t *) p_bitmap_c) = c;
        p_bitmap_a += sizeof(uint32_t);
        p_bitmap_b += sizeof(uint32_t);
        p_bitmap_c += sizeof(uint32_t);
    }
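If the buffers might not be 4-byte aligned, or the size is not a multiple of 4, one possible variant is sketched below. It swaps the pointer casts for `std::memcpy`, which compilers typically turn into a plain 32-bit load or store, and finishes the leftover bytes one at a time. The function name `and_bitmaps` and the use of `std::vector` are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: returns a AND b, processing 32 bits per step.
std::vector<std::uint8_t> and_bitmaps(const std::vector<std::uint8_t>& a,
                                      const std::vector<std::uint8_t>& b)
{
    assert(a.size() == b.size());

    std::vector<std::uint8_t> c(a.size());
    constexpr std::size_t word = sizeof(std::uint32_t);

    std::size_t i = 0;

    // Whole 32-bit words. memcpy avoids alignment and aliasing problems.
    for (; i + word <= a.size(); i += word) {
        std::uint32_t wa;
        std::uint32_t wb;
        std::memcpy(&wa, a.data() + i, word);
        std::memcpy(&wb, b.data() + i, word);
        const std::uint32_t wc = wa & wb;
        std::memcpy(c.data() + i, &wc, word);
    }

    // Remaining zero to three bytes, one at a time.
    for (; i < a.size(); ++i) {
        c[i] = static_cast<std::uint8_t>(a[i] & b[i]);
    }
    return c;
}
```

Because AND works on each bit independently, the result is the same regardless of the processor's byte order.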
Edit 1:
Your processor may have instructions that help with these operations. For example, an ARM7 processor can load many registers from memory with a single instruction. Review your processor's instruction set. You may need to use inline assembly to reach processor-specific instructions.
Edit 2: Threading and Parallel processing.
Unless your bitmaps are huge, the overhead of maintaining multiple threads of execution or parallel execution may outweigh the benefit. For example, if the overhead of synchronizing with another CPU core is 200 ms and processing the bitmap without interruption takes 1000 ms, you have wasted time by using parallel processing on that one bitmap (1200 ms to have another core process it).
If you have many bitmaps, you can save some time using parallel processing or several threads:
- One thread fetches bitmaps from the database into memory (buffers).
- Another thread processes the bitmaps and stores them in an outgoing buffer.
- A third thread writes the buffered bitmaps to the database.
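The three stages above could be wired together roughly as follows. This is only a sketch under several assumptions: it needs C++17, the database is replaced by in-memory vectors, and `Channel`, `run_pipeline`, and `mask` are names invented for this example. The processing step ANDs each bitmap with a fixed mask of the same size.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

using Bitmap = std::vector<std::uint8_t>;

// Hypothetical minimal blocking queue, used as the buffer between stages.
template <typename T>
class Channel {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }

    // Signals that no more items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until an item is available; returns an empty optional once
    // the channel is closed and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return std::nullopt;
        std::optional<T> value(std::move(items_.front()));
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// `source` stands in for the input database, the returned vector for the
// output database. Every bitmap must have the same size as `mask`.
std::vector<Bitmap> run_pipeline(const std::vector<Bitmap>& source,
                                 const Bitmap& mask)
{
    Channel<Bitmap> incoming;   // filled by the reader
    Channel<Bitmap> outgoing;   // filled by the worker
    std::vector<Bitmap> stored; // touched only by the writer until join

    std::thread reader([&] {
        for (const Bitmap& bmp : source) incoming.push(bmp);
        incoming.close();
    });

    std::thread worker([&] {
        while (auto bmp = incoming.pop()) {
            assert(bmp->size() == mask.size());
            for (std::size_t i = 0; i < bmp->size(); ++i) (*bmp)[i] &= mask[i];
            outgoing.push(std::move(*bmp));
        }
        outgoing.close();
    });

    std::thread writer([&] {
        while (auto bmp = outgoing.pop()) stored.push_back(std::move(*bmp));
    });

    reader.join();
    worker.join();
    writer.join();
    return stored;
}
```

With a single worker and first-in, first-out queues, the bitmaps come out in the order they went in. The queues here are unbounded; a real pipeline would probably cap them so a fast reader cannot fill memory.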
If you are fetching bitmaps from an external source, such as a database, that I/O will be your bottleneck. This is the part you need to optimize or work around, for example by buffering.