Where to center the kernel when using FFTW to convolve an image?

I am trying to use FFTW to convolve an image.

First, to check that the system works correctly, I ran the FFT followed by the inverse FFT and got the same image back.

Then, as a small step forward, I used the identity kernel (that is, kernel[0][0] = 1, with all other components 0). I took the component-wise product of the image and the kernel (both in the frequency domain) and then applied the inverse FFT. In theory, I should get back an identical image. But the result I got is not close to the original image. I suspect this has something to do with where I center my kernel before transforming it to the frequency domain (since I put the "1" at kernel[0][0], this basically means that I centered the positive part at the top left). Can someone enlighten me on what's wrong here?

+6
3 answers

For each dimension, the sample indices should run from -n/2 ... 0 ... n/2 - 1, so if the dimension is odd, center around the middle. If the dimension is even, center so that there is one more sample before the new 0 than after it.

E.g. -4, -3, -2, -1, 0, 1, 2, 3 for a width/height of 8, or -3, -2, -1, 0, 1, 2, 3 for a width/height of 7.
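A minimal sketch of that numbering in plain Python, in case it helps to see it spelled out. The function name `centered_indices` is made up for this illustration and is not part of FFTW.

```python
def centered_indices(n):
    """Return the center-based sample indices for a dimension of length n.

    For even n there is one more sample before 0 than after it.
    """
    start = -(n // 2)
    return list(range(start, start + n))


print(centered_indices(8))  # [-4, -3, -2, -1, 0, 1, 2, 3]
print(centered_indices(7))  # [-3, -2, -1, 0, 1, 2, 3]
```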

The FFT is relative to the middle; its scale includes negative points.
In memory the points are 0 ... n-1, but the FFT treats them as -floor(n/2) ... ceil(n/2) - 1, where memory index 0 is -floor(n/2) and memory index n-1 is ceil(n/2) - 1.

The identity matrix is a matrix of zeros with a 1 at location 0,0 (the center, according to the numbering above). (In the spatial domain.)

In the frequency domain, the identity matrix should be a constant (all real values 1 or 1/(N*M), and all imaginary values 0).

If you do not get this result, the identity matrix may need to be padded differently (to the left and down instead of around all sides); this may depend on the FFT implementation.
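One way to run this check is to transform a lone 1 placed at different memory positions and look at the spectrum. The sketch below is only an illustration: it uses a naive 1-D DFT written in plain Python as a stand-in for FFTW (an assumption made so the example has no dependencies), with the textbook definition in which memory index 0 is the origin. Under that definition the spectrum is constant only when the 1 sits at memory index 0.

```python
import cmath


def dft(x):
    """Naive unnormalized forward DFT; memory index 0 is the origin."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


n = 8
delta_at_origin = [1.0] + [0.0] * (n - 1)   # 1 at memory index 0
delta_at_middle = [0.0] * n
delta_at_middle[n // 2] = 1.0               # 1 at memory index n/2

spec_origin = dft(delta_at_origin)
spec_middle = dft(delta_at_middle)

# Every bin is 1 + 0j when the 1 sits at memory index 0.
print(all(abs(v - 1) < 1e-12 for v in spec_origin))
# With the 1 at index n/2 the magnitudes are still 1, but the sign alternates.
print([round(v.real) for v in spec_middle])
```

If your FFT gives the alternating pattern for the kernel you built, the 1 is not where that FFT considers the origin to be.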

Center each dimension separately (this is index centering only; nothing changes in actual memory).

You will probably need to pad the image (after centering) to a whole power of 2 in each dimension (2^n * 2^m, where n does not have to equal m).

Pad relative to the FFT's 0,0 location (the center, not the corner) by copying the existing pixels into a new, larger image, using center-based indices in both the source and destination images (e.g. (0,0) to (0,0), (0,1) to (0,1), (1,-2) to (1,-2)).
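A rough sketch of this padding for a single dimension, in plain Python. The helper names `next_power_of_two` and `pad_centered` are invented for this example, and it assumes that center-based index 0 lives at memory index len // 2, which matches the numbering above. For a 2-D image the same copy would be applied to each dimension.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def pad_centered(row, new_len):
    """Zero-pad a 1-D row so that center-based index i stays at index i.

    Center-based index 0 is assumed to live at memory index len // 2
    in both the source row and the padded row.
    """
    old_center = len(row) // 2
    new_center = new_len // 2
    out = [0] * new_len
    for mem, value in enumerate(row):
        centered = mem - old_center          # center-based index in the source
        out[centered + new_center] = value   # same center-based index in the target
    return out


row = [1, 2, 3, 4, 5]                        # center-based indices -2 .. 2
padded = pad_centered(row, next_power_of_two(len(row)))
print(padded)  # [0, 0, 1, 2, 3, 4, 5, 0]
```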

Assuming your FFT uses regular floating-point cells rather than complex cells, the complex image has to be of size 2*ceil(n/2) * 2*ceil(m/2), even if you don't need a whole power of 2 (since it has half the samples, but the samples are complex).

If your image has more than one color channel, you will first have to reshape it so that the channel is the most significant in the sub-pixel ordering instead of the least significant. You can reshape and pad in one pass to save time and space.
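To illustrate the reshape, here is a small sketch that turns interleaved pixels (channel least significant) into one plane per channel (channel most significant). It assumes a flat list in R, G, B order; the function name `interleaved_to_planar` is hypothetical.

```python
def interleaved_to_planar(pixels, channels):
    """Split [r0, g0, b0, r1, g1, b1, ...] into one flat list per channel."""
    return [pixels[c::channels] for c in range(channels)]


rgb = [10, 20, 30, 11, 21, 31, 12, 22, 32]   # three RGB pixels, interleaved
planes = interleaved_to_planar(rgb, 3)
print(planes)  # [[10, 11, 12], [20, 21, 22], [30, 31, 32]]
```

Each plane can then be padded and transformed on its own.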

Don't forget the FFTSHIFT after the IFFT. (To swap the quadrants.)
The result of the IFFT is 0 ... n-1. You have to take pixels floor(n/2)+1 ... n-1 and move them in front of pixels 0 ... floor(n/2).
This is done by copying the pixels into a new image: floor(n/2)+1 goes to memory location 0, floor(n/2)+2 to memory location 1, ..., n-1 to memory location ceil(n/2)-2, then 0 to memory location ceil(n/2)-1, 1 to memory location ceil(n/2), ..., floor(n/2) to memory location n-1.
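For one dimension, the copy described above could look like this in plain Python (the name `shift_after_ifft` is made up for the example). For a 2-D image you would apply it to every row and then to every column.

```python
def shift_after_ifft(row):
    """Move samples floor(n/2)+1 .. n-1 in front of samples 0 .. floor(n/2)."""
    half = len(row) // 2
    return row[half + 1:] + row[:half + 1]


print(shift_after_ifft(list(range(8))))  # [5, 6, 7, 0, 1, 2, 3, 4]
print(shift_after_ifft(list(range(7))))  # [4, 5, 6, 0, 1, 2, 3]
```

Note that for even n this split is one sample later than what NumPy's `fftshift` does (which would put sample n/2 first), so check which split matches your kernel centering.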

When you multiply in the frequency domain, remember that the samples are complex (one cell real, then one cell imaginary), so you have to use complex multiplication.
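As a sketch, assuming the spectra are stored as flat arrays of alternating real and imaginary cells, the pointwise product would be something like the following (`multiply_interleaved` is a name invented here):

```python
def multiply_interleaved(a, b):
    """Pointwise complex product of two [re0, im0, re1, im1, ...] arrays."""
    out = [0.0] * len(a)
    for i in range(0, len(a), 2):
        ar, ai = a[i], a[i + 1]
        br, bi = b[i], b[i + 1]
        out[i] = ar * br - ai * bi        # real part
        out[i + 1] = ar * bi + ai * br    # imaginary part
    return out


# (1 + 2i) * (3 + 4i) = -5 + 10i, and (0 + 1i) * (0 + 1i) = -1 + 0i
print(multiply_interleaved([1.0, 2.0, 0.0, 1.0], [3.0, 4.0, 0.0, 1.0]))
# [-5.0, 10.0, -1.0, 0.0]
```

Multiplying the real cells and the imaginary cells independently is a common mistake and gives a wrong image.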

The result may need to be divided by N^2 * M^2, where N is the size of n after padding (and likewise for M and m). You can tell by (a) looking at the frequency-domain values of the identity matrix, or (b) comparing the result with the input.
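The exact factor depends on how your FFT normalizes, so it is worth measuring. Here is a sketch of method (b) in one dimension, using a naive DFT in plain Python as a stand-in for the real library (an assumption, to keep the example dependency-free). Neither direction divides by n, which is also how FFTW documents its transforms. In this unnormalized 1-D sketch the factor comes out as the length n.

```python
import cmath


def dft(x, sign):
    """Naive unnormalized DFT; sign=-1 is forward, sign=+1 is inverse."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
round_trip = dft(dft(signal, -1), +1)

# Method (b): compare the result with the input to read off the scale.
scale = round_trip[0].real / signal[0]
print(round(scale))  # 8

restored = [v.real / scale for v in round_trip]
print(all(abs(r - s) < 1e-9 for r, s in zip(restored, signal)))
```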

+4

I think your understanding of the identity kernel may be off. The identity kernel should have the 1 at the center of the 2-D kernel, not at position 0,0.

For example, for a 3 x 3 kernel, you have it set up as follows:

    1, 0, 0
    0, 0, 0
    0, 0, 0

It should be

    0, 0, 0
    0, 1, 0
    0, 0, 0
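To see the difference between the two layouts, here is a small sketch of a direct (spatial-domain) convolution in plain Python. It assumes the kernel's origin is its center element and uses zero padding at the borders; `convolve2d_same` is a name made up for this example, not a library call.

```python
def convolve2d_same(image, kernel):
    """Direct 2-D convolution with zero padding; output has the image's size.

    The kernel's origin is taken to be its center element.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    cy, cx = kh // 2, kw // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for ky in range(kh):
                for kx in range(kw):
                    # Source pixel that this kernel tap reads from.
                    sy, sx = y - (ky - cy), x - (kx - cx)
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += image[sy][sx] * kernel[ky][kx]
            out[y][x] = acc
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
centered = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
corner = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]

print(convolve2d_same(image, centered) == image)  # True
print(convolve2d_same(image, corner))  # [[5, 6, 0], [8, 9, 0], [0, 0, 0]]
```

With the 1 in the corner, the image comes back shifted by one pixel up and to the left instead of unchanged.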

Also check out this question:

What is the "do-nothing" convolution kernel

Also see the bottom of page 3 here:

http://www.fmwconcepts.com/imagemagick/digital_image_filtering.pdf

0

I took the component-wise product of the image and the kernel in the frequency domain, then applied the inverse FFT. In theory, I should get back an identical image.

I don't think that doing a forward transform with a kernel that has not been FFT'd, followed by an inverse FFT, should lead to any expectation of getting the original image back, but perhaps I'm just misunderstanding what you were trying to say there...

-1

Source: https://habr.com/ru/post/918980/

