Pixel-editing code runs quickly in the main application but very slowly in a Delphi 6 DirectShow filter, with other problems

I have a Delphi 6 application that sends bitmaps to a DirectShow DLL in real time, at 25 frames per second. The DirectShow DLL is also my code and is also written in Delphi 6, using the DSPACK DirectShow component set. I have a simple block of code that goes through each pixel in the bitmap, changing the brightness and contrast of the image if a certain flag is set; otherwise the bitmap is pushed out of the DirectShow DLL unmodified (the DLL is a source filter). The code used to be in the main application, and I simply moved it into the DirectShow DLL. When it was in the main application, it worked fine and I could see the changes in the bitmap as expected. However, now that the code is in the DirectShow DLL, it has the following problems:

  • When the code block below is active, the DirectShow DLL is very slow, even on my quad-core i5, and I see a big surge in CPU consumption. In contrast, the same code running in the main application worked fine on an old single-core P4. It had a noticeable effect on the CPU of that old machine, but the video was smooth and there were no problems. The images are only 352 x 288 pixels in size.

  • I do not see the expected changes in the visible bitmap. I can trace the code in the DirectShow DLL and see that the numerical value of each pixel is changed correctly by the code, but the visible image in the ActiveMovie window looks completely unchanged.

  • If I deactivate the code, which I can do in real time, the ActiveMovie window shows video that is as smooth as glass, displayed perfectly with the CPU barely touched. If I reactivate the code, the video becomes very choppy, probably showing only 1 to 2 frames per second with a long delay before the first frame is displayed, and the CPU spikes. Not to 100%, but much more than I would expect.

I tried building the DirectShow DLL with every check turned on, including range checking, overflow checking, etc., and there were no warnings or errors at run time. Then I tried compiling for maximum speed, and it still has the same problems as above. Something is really wrong, and I can’t figure out what. Note that I do lock the canvas before changing the bitmap and unlock it once I have finished. If it weren’t for the "everything on" build I mentioned above, I would say it felt like an FPU exception was being raised and silently swallowed on each pixel calculation, but, as I said, no errors or exceptions are occurring.

UPDATE: I am putting this here so that the solution, which is embedded in one of Roman R.'s comments, is clearly visible. The problem is that I did not set the PixelFormat property to pf24Bit before accessing the ScanLine property. As Roman suggested, not doing this must cause TBitmap to create a temporary copy of the bitmap. As soon as I added the line of code below, the problems disappeared, both the changes not being visible and the page faults. This is an insidious problem, because the only object affected is the pointer you get from the ScanLine property, since (my guess) it points to a temporary copy of the bitmap. That is why the subsequent call to TextOut() still worked: it operated on the original copy of the bitmap.

clip.PixelFormat := pf24bit; // The missing code line that fixes the problem. 
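A minimal sketch of where that line might live, assuming the bitmap is a VCL TBitmap. The unit name BitmapPrep and the helper EnsurePf24Bit are hypothetical and not part of my original code. The idea is simply to force the pixel format before the first ScanLine access, so the loop's assumption of 3 bytes per pixel holds.

```pascal
unit BitmapPrep; // hypothetical unit name, for illustration only

interface

uses
  Graphics; // VCL: TBitmap, pf24bit

// Hypothetical helper: make sure the bitmap is 24-bit before any ScanLine access.
procedure EnsurePf24Bit(clip: TBitmap);

implementation

procedure EnsurePf24Bit(clip: TBitmap);
begin
  // Only touch the property when the format actually differs.
  if clip.PixelFormat <> pf24bit then
    clip.PixelFormat := pf24bit;
end;

end.
```

Calling EnsurePf24Bit(clip) right before brightnessTurboBoost(clip, ...) would be one way to use it.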

Here is the code in question:

    function IntToByte(i: Integer): Byte;
    begin
      if i > 255 then
        Result := 255
      else if i < 0 then
        Result := 0
      else
        Result := i;
    end;

    // ---------------------------------------------------------------

    procedure brightnessTurboBoost(var clip: TBitmap; rangeExpansionPowerOf2: integer; shiftValue: Byte);
    var
      p0: PByte;
      x,y: Integer;
    begin
      if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
        exit; // These parameter settings will not change the pixel values.

      for y := 0 to clip.Height-1 do
      begin
        p0 := clip.scanline[y];

        // Can't just do the whole buffer as a big block of bytes since the
        // individual scan lines may be padded for CPU alignment.
        for x := 0 to (clip.Width - 1) * 3 do
        begin
          if rangeExpansionPowerOf2 >= 1 then
            p0^ := IntToByte((p0^ shl rangeExpansionPowerOf2) + shiftValue)
          else
            p0^ := IntToByte(p0^ + shiftValue);

          Inc(p0);
        end;
      end;
    end;
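The scan-line padding mentioned in the code comment can be made concrete. This is a small console sketch, assuming the usual Windows DIB rule that each row is padded to a multiple of 4 bytes; the function name Stride24 is hypothetical.

```pascal
program StrideDemo;

{$APPTYPE CONSOLE}

// Bytes per scan line of a 24-bit DIB: rows are rounded up to a multiple of 4 bytes.
function Stride24(widthInPixels: Integer): Integer;
begin
  Result := ((widthInPixels * 24 + 31) div 32) * 4;
end;

begin
  WriteLn(Stride24(352)); // 1056: 352 * 3 is already a multiple of 4, so no padding
  WriteLn(Stride24(353)); // 1060: 353 * 3 = 1059 pixel bytes plus 1 padding byte
end.
```

For the 352-pixel-wide frames used here the rows happen to need no padding, but going row by row through ScanLine stays correct for widths where they do.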
1 answer

There are a few things to say about this piece of code.

  • First of all, you are using the ScanLine property of the TBitmap class. I haven’t touched Delphi in many years, so I could be wrong, but my feeling is that ScanLine is not actually a thin accessor, is it? It may be hiding things internally that can dramatically affect performance, for example, "if the caller wants to access the bits of the image, then we must first convert it to a DIB before returning pointers." So something that looks this simple can turn out to be a killer.

  • "if rangeExpansionPowerOf2 >= 1 then" in the body of the inner loop? You really do not want to make this comparison for every byte. Either write two separate functions, or duplicate the entire loop in two versions, one for zero and one for non-zero rangeExpansionPowerOf2, and do the if only once.

  • "for ... to (clip.Width - 1) * 3 do": I'm not sure that Delphi optimizes the evaluation of the upper bound so that it happens only once. You may be doing this multiplication by three for every pixel, when you could do it just once for the whole image.

  • For top performance, IntToByte should definitely be implemented with MMX, to avoid the ifs and process several bytes at once.
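The second and third points could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in replacement: the unit name, the procedure name brightnessTurboBoostHoisted, and the local pointer type PPixelByte are hypothetical. It also sets PixelFormat before touching ScanLine, as described in the question's update. Note that the bound Width * 3 - 1 is my assumption about the intent; the original bound (clip.Width - 1) * 3 leaves the last two bytes of each row untouched.

```pascal
unit BrightnessSketch; // hypothetical unit name, for illustration only

interface

uses
  Graphics; // VCL: TBitmap, pf24bit

procedure brightnessTurboBoostHoisted(clip: TBitmap;
  rangeExpansionPowerOf2: Integer; shiftValue: Byte);

implementation

type
  PPixelByte = ^Byte; // local pointer type so the sketch does not depend on where PByte is declared

function IntToByte(i: Integer): Byte;
begin
  if i > 255 then
    Result := 255
  else if i < 0 then
    Result := 0
  else
    Result := i;
end;

procedure brightnessTurboBoostHoisted(clip: TBitmap;
  rangeExpansionPowerOf2: Integer; shiftValue: Byte);
var
  p0: PPixelByte;
  x, y, lastByte: Integer;
begin
  if (rangeExpansionPowerOf2 = 0) and (shiftValue = 0) then
    Exit; // nothing would change

  clip.PixelFormat := pf24bit;    // must precede the first ScanLine access
  lastByte := clip.Width * 3 - 1; // computed once; assumed intent: every byte of every pixel

  if rangeExpansionPowerOf2 >= 1 then
  begin
    // Branch taken once, outside both loops: shift and add.
    for y := 0 to clip.Height - 1 do
    begin
      p0 := clip.ScanLine[y];
      for x := 0 to lastByte do
      begin
        p0^ := IntToByte((p0^ shl rangeExpansionPowerOf2) + shiftValue);
        Inc(p0);
      end;
    end;
  end
  else
  begin
    // No range expansion: add only.
    for y := 0 to clip.Height - 1 do
    begin
      p0 := clip.ScanLine[y];
      for x := 0 to lastByte do
      begin
        p0^ := IntToByte(p0^ + shiftValue);
        Inc(p0);
      end;
    end;
  end;
end;

end.
```

Duplicating the loop costs a few extra lines, but the comparison on rangeExpansionPowerOf2 then runs once per call instead of once per byte.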

Since, as you say, the images are only 352x288, I would suspect that #1 is what is destroying performance.
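If MMX is not an option, a 256-entry lookup table is another way to get the ifs out of the inner loop. This is a different technique from the MMX suggestion above, offered only as a sketch: the names TByteTable and BuildTable are hypothetical, and it assumes rangeExpansionPowerOf2 is a small non-negative number. Each pixel byte would then be replaced with table[p0^].

```pascal
program ClampTableDemo;

{$APPTYPE CONSOLE}

type
  TByteTable = array[Byte] of Byte;

// Fill the table with the clamped result for every possible input byte.
procedure BuildTable(var table: TByteTable;
  rangeExpansionPowerOf2: Integer; shiftValue: Byte);
var
  b, v: Integer;
begin
  for b := 0 to 255 do
  begin
    v := (b shl rangeExpansionPowerOf2) + shiftValue;
    if v > 255 then
      v := 255; // v cannot be negative here: both terms are non-negative
    table[b] := v;
  end;
end;

var
  table: TByteTable;
begin
  BuildTable(table, 1, 10);
  WriteLn(table[100]); // 210: (100 shl 1) + 10
  WriteLn(table[200]); // 255: 410 clamped to the byte range
end.
```

The table is built once per call (256 iterations), which is small next to the roughly 300,000 bytes in a 352x288 24-bit frame.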


Source: https://habr.com/ru/post/1389613/

