This is a fairly tricky question, so I'll break it into parts and answer each one separately.
Question number 1
blockproc can be used to apply my function over a sliding window using the BorderSize and TrimBorder arguments.
B = blockproc(A,[64,64],fun,'BorderSize',[5,5], 'TrimBorder', 'false');
I understand that this creates a block of size [64 + 2*5, 64 + 2*5] and applies the function @fun to each block. But since I cannot step into @fun in the debugger to verify that it works, I cannot be sure that this is what I need. Is my code above correct for what I need? I know that I get a concatenated result in B, but it should come from overlapping sliding blocks. Will this achieve what I need?
After experimenting with blockproc, I can confirm that this is correct: you can use it to process a sliding neighborhood. However, you will need one additional flag, PadPartialBlocks. Its purpose is that when a block is extracted near the outer edges of the image and a block of the specified size cannot be formed, the partial block is zero-padded so that it matches the full size. Here is a small example of making this work with sliding windows. Suppose we have the following matrix:
    >> A = reshape(1:25,5,5)

    A =

         1     6    11    16    21
         2     7    12    17    22
         3     8    13    18    23
         4     9    14    19    24
         5    10    15    20    25
Let's say we want the average of each overlapping 3 x 3 neighborhood in the matrix above, zero-padding any elements that fall outside the boundaries of the matrix. You would do this with blockproc:
B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true);
Note that the block size is 1 x 1 here, and that BorderSize is also 1 x 1, which is not what you might expect for a 3 x 3 neighborhood. To see why, we need to understand how BorderSize works. For a given block, BorderSize lets you capture values / pixels outside the extent of the original block. Any locations that fall outside the matrix are set to zero by default. With BorderSize set to [V H], each block gains V rows above and V rows below, and H columns to the left and H columns to the right, so an M x N block grows to (M + 2V) x (N + 2H).
Therefore, for the value 1 in A, a 1 x 1 block means the block consists of only that element, and our BorderSize is 1 x 1. This means that the resulting block will be:
    0  0  0
    0  1  6
    0  2  7
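To see where that 3 x 3 block comes from, here is a small sketch in base MATLAB that mimics the padding by hand. The names P, V, H, firstBlock and secondBlock are my own, and this only imitates what blockproc hands to the function for a [1 1] block with a [1 1] border; it is not a description of how blockproc is implemented internally.

```matlab
A = reshape(1:25, 5, 5);
V = 1; H = 1;                    % border rows / columns (hypothetical names)

% Zero-pad A by V rows on top/bottom and H columns on left/right.
P = zeros(size(A,1) + 2*V, size(A,2) + 2*H);
P(V+1:end-V, H+1:end-H) = A;

% The 1 x 1 block at A(1,1) plus its border is the top-left 3 x 3 window of P.
firstBlock = P(1:1+2*V, 1:1+2*H);

% The block for the element 6, i.e. A(1,2), is the same window shifted one column right.
secondBlock = P(1:1+2*V, 2:2+2*H);
```

Taking mean(firstBlock(:)) gives 16/9, which is the top-left entry you should expect from the sliding average.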
Since our block size is 1, the next block is centered at 6 and again gives a 3 x 3 grid, and so on. It is also important to set TrimBorder to false so that the pixels captured by expanding the block are kept; the default is true. Finally, PadPartialBlocks is set to true to ensure that all blocks are the same size. Running the code above gives:
    B =

        1.7778    4.3333    7.6667   11.0000    8.4444
        3.0000    7.0000   12.0000   17.0000   13.0000
        3.6667    8.0000   13.0000   18.0000   13.6667
        4.3333    9.0000   14.0000   19.0000   14.3333
        3.1111    6.3333    9.6667   13.0000    9.7778
You can verify that we get the same result using nlfilter , where we can apply the average to 3 x 3 moving neighborhoods:
    C = nlfilter(A, [3 3], @(x) mean(x(:)))

    C =

        1.7778    4.3333    7.6667   11.0000    8.4444
        3.0000    7.0000   12.0000   17.0000   13.0000
        3.6667    8.0000   13.0000   18.0000   13.6667
        4.3333    9.0000   14.0000   19.0000   14.3333
        3.1111    6.3333    9.6667   13.0000    9.7778
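As one more sanity check that needs no toolbox, a zero-padded moving average is the same as convolving with a uniform kernel. This is only a sketch with my own variable names (kernel, D); conv2 with the 'same' option treats values outside the matrix as zero, which matches the zero padding used in this example.

```matlab
A = reshape(1:25, 5, 5);
kernel = ones(3) / 9;            % 3 x 3 averaging kernel
D = conv2(A, kernel, 'same');    % same size as A, zeros assumed outside the matrix
```

D should agree with the 3 x 3 sliding average up to floating-point rounding.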
Thus, if you want to use blockproc for sliding operations, you need to be careful about how you set the block size and the border size. The general rule is to always set the block size to 1 x 1 and let BorderSize determine the size of each neighborhood. Specifically, for a K x K neighborhood, set BorderSize to [floor(K/2) floor(K/2)]. This assumes K is odd; for even K the resulting neighborhood would be (K+1) x (K+1).
For example, if you need a 5 x 5 averaging filter over a sliding window, set BorderSize to [2 2], since K = 5 and floor(K/2) = 2. You would therefore do the following:
    B = blockproc(A, [1 1], @(x) mean(x.data(:)), 'BorderSize', [2 2], 'TrimBorder', false, 'PadPartialBlocks', true)

    B =

        2.5200    4.5600    7.2000    6.9600    6.1200
        3.6000    6.4000   10.0000    9.6000    8.4000
        4.8000    8.4000   13.0000   12.4000   10.8000
        4.0800    7.0400   10.8000   10.2400    8.8800
        3.2400    5.5200    8.4000    7.9200    6.8400
Replicating this with nlfilter and a 5 x 5 neighborhood gives the same result:
    C = nlfilter(A, [5 5], @(x) mean(x(:)))

    C =

        2.5200    4.5600    7.2000    6.9600    6.1200
        3.6000    6.4000   10.0000    9.6000    8.4000
        4.8000    8.4000   13.0000   12.4000   10.8000
        4.0800    7.0400   10.8000   10.2400    8.8800
        3.2400    5.5200    8.4000    7.9200    6.8400
I ran some timing tests, and blockproc used this way appears to be faster than nlfilter.
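If you want to check the timing on your own machine, something along these lines should do. This is a rough sketch: the 200 x 200 random matrix and the names meanBlock, tBlockproc and tNlfilter are arbitrary choices of mine, and the outcome will depend on your MATLAB version, hardware and image size, so I am not quoting any numbers here.

```matlab
A = rand(200);                        % arbitrary test size, chosen for illustration
meanBlock = @(b) mean(b.data(:));     % function applied to each block struct

% timeit runs each call several times and returns a typical time in seconds.
tBlockproc = timeit(@() blockproc(A, [1 1], meanBlock, ...
    'BorderSize', [1 1], 'TrimBorder', false, 'PadPartialBlocks', true));
tNlfilter = timeit(@() nlfilter(A, [3 3], @(x) mean(x(:))));

fprintf('blockproc: %.3f s, nlfilter: %.3f s\n', tBlockproc, tNlfilter);
```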
Question number 2
The second is im2col. Does im2col(A,[m n],block_type) split the image into m-by-n blocks and arrange them in columns, so that each column is a block? If so, how is the overlap controlled? And if each block is a column, can I successfully apply the dct2 function to each column? I doubt that it will accept vectors as input.
You are right that im2col converts each pixel neighborhood or block into a single column, and concatenating these columns forms the output matrix. Whether the blocks overlap is controlled by the block_type parameter: specify 'distinct' or 'sliding' (the default). You can also control the size of each neighborhood with m and n.
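Here is a small illustration of the two modes on the same 5 x 5 matrix used earlier. The variable names colsSliding and colsDistinct are mine; the calls are plain im2col.

```matlab
A = reshape(1:25, 5, 5);

% 'sliding': one column per 3 x 3 neighborhood that fits entirely inside A.
colsSliding = im2col(A, [3 3], 'sliding');

% 'distinct': non-overlapping 3 x 3 blocks; A is zero-padded so the blocks tile it.
colsDistinct = im2col(A, [3 3], 'distinct');

size(colsSliding)     % each column holds the 9 values of one neighborhood
size(colsDistinct)
```

Note that 'sliding' does not pad the matrix, so a 5 x 5 input with 3 x 3 neighborhoods yields only the 3 x 3 = 9 interior positions, unlike the zero-padded blockproc approach.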
However, if your goal is to apply dct2 to the output of im2col, you will not get what you want. dct2 uses the spatial location of each data point in your 2D data as part of the transform. Once each pixel neighborhood has been flattened into a single column, the 2D spatial relationships that existed within each block are gone. dct2 expects 2D spatial data, but you would be giving it 1D data instead. So im2col is probably not what you are looking for. If I understand correctly what you want, you can use blockproc instead.
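As a rough sketch of the difference, the following applies dct2 to distinct 8 x 8 blocks with blockproc and compares transforming a block as a matrix against transforming it as a flattened column. The magic(16) matrix is just a stand-in for an image, and the names blk, asBlock and asColumn are mine. I am using distinct blocks here for simplicity, which is an assumption about what you need.

```matlab
I = magic(16);                                 % stand-in for a grayscale image

% Apply the 2D DCT to each distinct 8 x 8 block; the results are tiled into B.
B = blockproc(I, [8 8], @(b) dct2(b.data));

blk = I(1:8, 1:8);
asBlock  = dct2(blk);       % 8 x 8 result that uses the 2D layout of the block
asColumn = dct2(blk(:));    % 64 x 1 result: a 1D transform of the flattened data
```

The top-left 8 x 8 tile of B matches asBlock, whereas asColumn has a different shape and different coefficients, which is why feeding im2col columns to dct2 does not give the per-block 2D transform.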
Hope this helps!