How to efficiently calculate the average "direction" of pixels in a grayscale image?

So, I figured out that I can convert the image to grayscale as follows:

    public static Bitmap GrayScale(this Image img)
    {
        var bmp = new Bitmap(img.Width, img.Height);
        using (var g = Graphics.FromImage(bmp))
        {
            var colorMatrix = new ColorMatrix(new[]
            {
                new[] {.30f, .30f, .30f, 0, 0},
                new[] {.59f, .59f, .59f, 0, 0},
                new[] {.11f, .11f, .11f, 0, 0},
                new[] {0, 0, 0, 1.0f, 0},
                new[] {0, 0, 0, 0, 1.0f}
            });
            using (var attrs = new ImageAttributes())
            {
                attrs.SetColorMatrix(colorMatrix);
                g.DrawImage(img, new Rectangle(0, 0, img.Width, img.Height),
                    0, 0, img.Width, img.Height, GraphicsUnit.Pixel, attrs);
            }
        }
        return bmp;
    }

Now I want to calculate the average "direction" of pixels.

What I have in mind is to look at each 3x3 region: if the left side is darker than the right side, the direction is to the right; if the bottom is darker than the top, the direction is up; if the lower-left is darker than the upper-right, the direction is up-and-right. (Think of a small vector arrow above each 3x3 area.) Perhaps the best example: if you draw a gray gradient in Photoshop, I want to calculate the angle at which it was drawn.
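To illustrate what I mean, here is a rough Python sketch of the 3x3 idea. The Sobel kernels and all the names here are just my illustration of the "compare darker/brighter sides" comparison, not necessarily what I'd use:

```python
import math

# Sobel operators approximate exactly these comparisons: left-vs-right (x)
# and top-vs-bottom (y). The resulting vector points from dark toward bright.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def region_direction(region3x3):
    """Direction (degrees) of a 3x3 grayscale region; 0 deg = right."""
    gx = sum(SOBEL_X[j][i] * region3x3[j][i] for j in range(3) for i in range(3))
    gy = sum(SOBEL_Y[j][i] * region3x3[j][i] for j in range(3) for i in range(3))
    return math.degrees(math.atan2(gy, gx))

# Left column dark, right column bright: the arrow points right (0 degrees).
region = [[0, 128, 255]] * 3
print(region_direction(region))  # -> 0.0
```

(Note that with screen coordinates, y grows downward, so a positive `gy` means the bottom is brighter.)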

I did things like this in MATLAB, but that was many years ago. I suppose I could use a matrix similar to the ColorMatrix above to calculate this, but I'm not quite sure how. It looks like that feature may be what I want; can I convert the image to grayscale (as above) and then do something with a matrix over the grayscale values to calculate these directions?

IIRC, what I want is very similar to edge detection.

After calculating these direction vectors, I would just go through them and compute the average direction of the image.

The ultimate goal is to rotate images so that their average direction is always up; thus, if I have two identical images, except one is rotated (90, 180 or 270 degrees), they will end up oriented the same way (I don't care if a person ends up upside down).




*snip* Removed some rambling. You can view the edit history if you want to read the rest of my attempts.

+7
c# image image-processing
Apr 15 '12
4 answers

Calculating the average angle directly is usually a bad idea:

    ...
        sum += Math.Atan2(yi, xi);
      }
    }
    double avg = sum / (img.Width * img.Height);

The average of a set of angles has no clear meaning: for example, the average of one angle pointing up and one pointing down is an angle pointing right. Is that what you want? Assuming "up" is +PI, the average of two angles that both point almost up will be an angle pointing down, if one angle is PI - [small value] and the other is -PI + [small value]. Probably not what you want either. In addition, you are completely ignoring the strength of each edge: most of the pixels in real images are not edge pixels, so their gradient direction is mostly noise.

If you want to compute something like a "mean direction", you have to add up vectors instead of angles, then calculate Atan2 after the loop. Problem is: this vector sum tells you nothing about the objects inside the image, because gradients pointing in opposite directions cancel each other out. It only tells you about the difference in brightness between the first/last row and the first/last column of the image. Probably not what you want either.

I think the simplest way to orient images is to create an angle histogram: create an array with (for example) 360 bins for 360° of gradient directions. Then calculate the gradient angle and magnitude for each pixel, and add each gradient magnitude to the right bin. That will not give you a single angle but an angle histogram, which can then be used to orient two images to each other using simple cyclic correlation.
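To make the bin-filling step concrete, here is a minimal dependency-free Python sketch. It uses plain central differences instead of Gaussian derivatives, and a synthetic ramp image; all names are my own:

```python
import math

def angle_histogram(img, bins=360):
    """Angle histogram weighted by gradient magnitude.

    Central differences stand in for the Gaussian derivatives used below,
    just to keep the sketch self-contained."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue                       # flat region: no direction
            ang = math.degrees(math.atan2(gy, gx)) % 360
            hist[int(ang) % bins] += mag       # add magnitude to the right bin
    return hist

# Horizontal ramp: every gradient points right, so one sharp peak at 0 deg.
img = [[float(x) for x in range(16)] for _ in range(16)]
hist = angle_histogram(img)
peak = max(range(360), key=lambda i: hist[i])
print(peak)  # -> 0
```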

Here is a proof-of-concept Mathematica implementation I put together to see if this works:

    angleHistogram[src_] := (
      Lx = GaussianFilter[ImageData[src], 2, {0, 1}];
      Ly = GaussianFilter[ImageData[src], 2, {1, 0}];
      angleAndOrientation =
        MapThread[{Round[ArcTan[#1, #2]*180/\[Pi]], Sqrt[#1^2 + #2^2]} &, {Lx, Ly}, 2];
      angleAndOrientationFlat = Flatten[angleAndOrientation, 1];
      bins = BinLists[angleAndOrientationFlat, 1, 5];
      histogram = Total /@ Flatten[bins[[All, All, All, 2]], {{1}, {2, 3}}];
      maxIndex = Position[histogram, Max[histogram]][[1, 1]];
      Labeled[
        Show[
          ListLinePlot[histogram, PlotRange -> All],
          Graphics[{Red, Point[{maxIndex, histogram[[maxIndex]]}]}]
        ],
        "Maximum at " <> ToString[maxIndex] <> "\[Degree]"]
    )

Results with sample images:

[images: sample images with their angle histograms]

The angle histograms also show why averaging angles cannot work: the histogram is essentially one sharp peak on top of an approximately uniform background. The average of such a histogram will always be dominated by that uniform "background noise". That is why you get an almost identical average angle (about 180°) for every "real-life" image with your current algorithm.

The tree image has one dominant angle (the horizon), so in this case you could use the mode of the histogram (the most frequent angle). But that will not work for every image:

[image: an image whose angle histogram has two peaks]

Here you have two peaks. Cyclic correlation should still orient two such images to each other, but simply using the mode is probably not enough.

Also note that the peak in the angle histogram is not "up": in the tree image above, the peak is probably the horizon, so it points up. In the Lena image, it is the vertical white stripe in the background, so it points to the right. Simply orienting images by their most frequent angle will not rotate every image right-side up.

[image: an image whose angle histogram has several peaks]

This image has even more peaks: using the mode (or probably any single angle) would be unreliable for orienting it. But the angle histogram as a whole should still give you a reliable orientation.

Note: I did not preprocess the images, did not compute gradients at different scales, and did not postprocess the resulting histogram. In a real application you would tweak all of these things to get the best algorithm for a large set of test images; this is just a quick test to see whether the idea can work at all.

Added: To orient two images using these histograms, you would:

  • Normalize all histograms, so the area under each histogram is the same for every image (even if some are brighter, darker or blurrier).
  • Take the histograms of the two images and compare them at each rotation you are interested in:

For example, in C #:

    int bestDifferenceSoFar = int.MaxValue;
    int foundRotation = 0;
    for (int rotationAngle = 0; rotationAngle < 360; rotationAngle++)
    {
        int difference = 0;
        for (int i = 0; i < 360; i++)
            difference += Math.Abs(histogram1[i] - histogram2[(i + rotationAngle) % 360]);
        if (difference < bestDifferenceSoFar)
        {
            bestDifferenceSoFar = difference;
            foundRotation = rotationAngle;
        }
    }

(You could speed this up with an FFT if your histogram length is a power of two, but the code would be much more complicated, and for 256 bins it probably doesn't matter.)

+9
Apr 16 '12 at 16:28

Well, I can suggest another way to do this. It may not be exactly what you asked for, but I hope it works for you.

Most likely your calculations are fine; it's just that the gradient, once averaged, ends up with a different mean than you expect. So I suspect that, looking at the image, you feel there should be a different mean angle in it. Therefore:

  • Convert the image to binary.
  • Find lines using the Hough transform.
  • Take the longest line and calculate its angle. This should give you the most prominent angle.
  • You may need pre/post processing to get correct lines.
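The Hough-transform step above can be sketched in a few lines of pure Python (a toy stand-in for a real implementation such as OpenCV's; the function name, bin resolutions, and the synthetic diagonal line are all my own choices):

```python
import math
from collections import defaultdict

def hough_peak_angle(points, theta_steps=180, rho_res=0.25):
    """Vote in (theta, rho) space; return the theta (degrees, 0-179) of the
    strongest line and its vote count."""
    acc = defaultdict(int)
    for x, y in points:
        for t in range(theta_steps):
            theta = math.radians(t)
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_res))] += 1
    (t_peak, _), votes = max(acc.items(), key=lambda kv: kv[1])
    return t_peak, votes

# 20 edge points on the diagonal y = x; that line's normal is at 135 degrees,
# so all 20 points vote into the same (135, 0) accumulator cell.
pts = [(i, i) for i in range(20)]
theta, votes = hough_peak_angle(pts)
print(theta, votes)  # -> 135 20
```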

As another approach, try GIST. It is basically a descriptor commonly used in scene recognition. I believe your images are real scenes, which is why I suggest this approach. The method gives you a vector that you compare against the vectors of differently oriented versions of the same image. It is a very well-known technique and should be applicable in your case.

+1
Apr 16 '12

Consider using the gradient of your image to calculate the direction you need: en.wikipedia.org/wiki/Image_gradient

0
Apr 16 '12 at 6:24

You need to convolve the image with two Gaussian derivative kernels (one in X and one in Y). These are in fact the Lx and Ly in the answer above.

Subtract the mean pixel intensity first, before computing the summed product between the sliding window (a sub-window of the original image) and the first-order Gaussian derivative kernels.

See for example this tutorial: http://bmia.bmt.tue.nl/people/bromeny/MICCAI2008/Materials/05%20Gaussian%20derivatives%20MMA6.pdf

Select an appropriate smoothing factor sigma >= 1.

To calculate the Gaussian derivative kernels, differentiate the two-dimensional Gaussian function (known from the normal distribution), with the 1-D term '(x-0)^2' replaced by (x^2 + y^2). You can plot it in 2D, for example in MS Excel.
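A minimal pure-Python sketch of that differentiation, building the x-derivative kernel from the 2-D Gaussian and checking it on a synthetic ramp image (the sigma/radius values and all names are my own assumptions):

```python
import math

def gaussian_dx_kernel(sigma, radius):
    """d/dx of the 2-D Gaussian: -x/sigma^2 * G(x, y)."""
    def g(x, y):
        return math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)
    return [[-x / (sigma * sigma) * g(x, y)
             for x in range(-radius, radius + 1)]
            for y in range(-radius, radius + 1)]

def convolve_at(img, kernel, cy, cx):
    """True convolution (kernel flipped), evaluated at one pixel."""
    r = len(kernel) // 2
    return sum(kernel[j + r][i + r] * img[cy - j][cx - i]
               for j in range(-r, r + 1) for i in range(-r, r + 1))

# Horizontal ramp (brightness grows to the right): Lx ~ slope 1, Ly ~ 0,
# so the gradient direction at the center is ~0 degrees (pointing right).
img = [[float(x) for x in range(11)] for _ in range(11)]
kx = gaussian_dx_kernel(1.5, 4)
ky = [list(row) for row in zip(*kx)]   # y-derivative kernel = transpose
Lx = convolve_at(img, kx, 5, 5)
Ly = convolve_at(img, ky, 5, 5)
angle = math.degrees(math.atan2(Ly, Lx))
```

(Lx comes out slightly below 1.0 because the kernel is truncated at the chosen radius.)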

Good luck

Michael

0
May 23 '12 at 21:45


