Using GPUImage and GPUImageHoughTransformLineDetector to Detect Selected Text Bounding Block

I use GPUImageHoughTransformLineDetector to try to detect the selected text in the image:

enter image description here

I use the following code to try to detect the bounding blue lines:

    GPUImagePicture *stillImageSource = [[GPUImagePicture alloc] initWithImage:rawImage];

    GPUImageHoughTransformLineDetector *lineFilter = [[GPUImageHoughTransformLineDetector alloc] init];
    [stillImageSource addTarget:lineFilter];

    GPUImageLineGenerator *lineDrawFilter = [[GPUImageLineGenerator alloc] init];
    [lineDrawFilter forceProcessingAtSize:rawImage.size];

    __weak typeof(self) weakSelf = self;
    [lineFilter setLinesDetectedBlock:^(GLfloat *flt, NSUInteger count, CMTime time) {
        // count is an NSUInteger, so print it with %lu after the cast
        NSLog(@"Number of lines: %lu", (unsigned long)count);

        // Blend the generated lines on top of the original image
        GPUImageAlphaBlendFilter *blendFilter = [[GPUImageAlphaBlendFilter alloc] init];
        [blendFilter forceProcessingAtSize:rawImage.size];
        [stillImageSource addTarget:blendFilter];
        [lineDrawFilter addTarget:blendFilter];

        [blendFilter useNextFrameForImageCapture];
        [lineDrawFilter renderLinesFromArray:flt count:count frameTime:time];

        // Hand the blended image back to the caller
        weakSelf.doneProcessingImage([blendFilter imageFromCurrentFramebuffer]);
    }];

    [stillImageSource processImage];

Each time I run this, regardless of the edgeThreshold value, I get 1023 lines, and the final result looks like this:

enter image description here

I don't understand why changing the threshold does nothing, but I'm sure I'm misunderstanding something. Does anyone have ideas on the best way to do this?

1 answer

I just made some improvements to the Hough transform line detector in the framework, which will help with this, but you will need to do additional preprocessing of your image to pick out that blue selection box.

Let me explain how this operation works. First, it detects the edges in the image. For each pixel identified as an edge (right now, I'm using a Canny edge detector for this), the coordinate of that pixel is extracted. Then, each of these coordinates is used to draw a pair of lines in a parallel coordinate space (based on the process described in "Real-time detection of lines using parallel coordinates and OpenGL" by Dubská et al.).

Pixels in a parallel coordinate space where straight lines intersect will increase in intensity. The points of greatest intensity in the parallel coordinate space indicate the presence of a line in the real scene.
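To make the intersection idea concrete, here is a minimal CPU sketch in plain C (which also compiles as Objective-C). It is not the framework's shader code. It assumes the axis layout I understand from the Dubská et al. paper: a -y axis at u = -d, an x axis at u = 0, and a y axis at u = +d, so each edge pixel (x, y) becomes a two-segment polyline. The function names and the spacing parameter d are hypothetical.

```c
#include <math.h>

/* Height v of the polyline drawn for image point (x, y), sampled at
 * horizontal position u of the parallel coordinate space.
 * Assumed axis layout: -y axis at u = -d, x axis at u = 0, y axis at u = +d.
 * Only meaningful for -d <= u <= d. */
double pc_polyline_at(double x, double y, double d, double u)
{
    if (u >= 0.0) {
        /* "Straight" half: segment from (0, x) to (d, y). */
        return x + (y - x) * (u / d);
    }
    /* "Twisted" half: segment from (-d, -y) to (0, x). */
    return x + (x + y) * (u / d);
}

/* Point (u, v) where the polylines of every pixel on the image line
 * y = m * x + b cross each other. */
void pc_line_to_point(double m, double b, double d, double *u, double *v)
{
    if (m <= 0.0) {
        /* Non-positive slopes meet in the straight half, 0 < u <= d. */
        *u = d / (1.0 - m);
        *v = b / (1.0 - m);
    } else {
        /* Positive slopes meet in the twisted half, -d < u < 0. */
        *u = -d / (1.0 + m);
        *v = -b / (1.0 + m);
    }
}
```

For example, the pixels (0, 2) and (4, 4) both lie on y = 0.5x + 2, and calling pc_polyline_at for each of them at the u returned by pc_line_to_point(0.5, 2.0, 1.0, ...) gives the same v. That shared point is the one that accumulates intensity when many edge pixels fall on one line.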

However, only pixels that are local maxima in intensity indicate real lines. The challenge is determining those local maxima in a way that suppresses noise from busy scenes. This is something I haven't completely solved in this operation. In your image above, the huge number of lines comes from a mess of points that lie above the detection threshold in the parallel coordinate space but were not properly removed for failing to be local maxima.

That said, I've made some improvements, so I now get a cleaner result from the operation (I quickly captured this from a live video feed of my screen):

enter image description here

I fixed a bug in the local non-maximum suppression filter and expanded the area it operates over from 3x3 to 5x5. It still leaves a bunch of non-maximum points that contribute to noise, but it is much better.
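As a rough illustration of why the larger window helps, here is a CPU sketch in plain C of a thresholded local-maximum test over the accumulator. It is not the framework's GPU filter; the function name, the row-major float accumulator, and the radius parameter (1 for 3x3, 2 for 5x5) are assumptions made for the example.

```c
#include <stddef.h>

/* Returns 1 when the accumulator cell (x, y) is above `threshold` and no
 * cell within `radius` of it holds a strictly larger value, 0 otherwise.
 * radius = 1 checks a 3x3 window, radius = 2 checks a 5x5 window.
 * `acc` is a row-major width x height array; cells outside it are ignored.
 * Equal neighbors (plateaus) are all kept by this simple test. */
int is_local_max(const float *acc, int width, int height,
                 int x, int y, int radius, float threshold)
{
    float center = acc[(size_t)y * (size_t)width + (size_t)x];
    if (center <= threshold) {
        return 0;
    }
    for (int dy = -radius; dy <= radius; dy++) {
        for (int dx = -radius; dx <= radius; dx++) {
            int nx = x + dx;
            int ny = y + dy;
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) {
                continue;
            }
            if (acc[(size_t)ny * (size_t)width + (size_t)nx] > center) {
                return 0;
            }
        }
    }
    return 1;
}
```

With a peak of 0.9 and a weaker response of 0.8 two cells away, the weaker cell still passes with radius 1 but is rejected with radius 2, which is the kind of near-duplicate line the wider window removes.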

You will notice that this is still not quite what you want. It picks up lines in the text, but not your selection box. This is because black text on a white background produces very strong, very sharp edges at the edge detection stage, whereas the light blue selection box on a white background requires an extremely low threshold to even be picked up by any edge detection process.

If you are always going to be picking out a blue selection box, then I would recommend a preprocessing operation to uniquely identify the blue objects in the scene. An easy way to do this is to define a custom filter that subtracts the red component from the blue component for each pixel, floors negative values at zero, and uses the result of this calculation as the output for the red, green, and blue channels. You might even want to multiply the result by 2.0-3.0 to amplify this difference.
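The per-pixel arithmetic of that blue-minus-red filter can be sketched on the CPU in plain C as follows. In GPUImage itself this would more naturally live in a custom fragment shader, so treat this only as a reference for the math. The function name, the 8-bit RGBA buffer layout, and the gain parameter are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* For each RGBA8 pixel, computes (blue - red) * gain, floors negative
 * values at 0, caps the result at 255, and writes that value to the red,
 * green and blue output channels. Alpha is copied through unchanged.
 * `rgba` and `out` each hold pixel_count * 4 bytes. */
void blue_minus_red(const uint8_t *rgba, uint8_t *out,
                    size_t pixel_count, float gain)
{
    for (size_t i = 0; i < pixel_count; i++) {
        const uint8_t *p = rgba + 4 * i;
        float diff = ((float)p[2] - (float)p[0]) * gain;
        if (diff < 0.0f) {
            diff = 0.0f;      /* floor negative differences */
        }
        if (diff > 255.0f) {
            diff = 255.0f;    /* keep the amplified value in range */
        }
        uint8_t value = (uint8_t)(diff + 0.5f);
        out[4 * i + 0] = value;
        out[4 * i + 1] = value;
        out[4 * i + 2] = value;
        out[4 * i + 3] = p[3];
    }
}
```

With a gain of 2.0, white (255, 255, 255) and black (0, 0, 0) pixels both map to 0, while a hypothetical light blue of (180, 210, 255) maps to 150, so the selection box stands out and the text disappears.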

The result should be an image in which the blue areas of your image show up as white and everything else as black. This will greatly improve the contrast around your selection box and make it easier to pick out from the text. You will need to experiment with the parameters to make this as reliable as possible for your case.


Source: https://habr.com/ru/post/979741/

