Advanced image processing in deep learning

I am experimenting with deep learning on images. I have about 4,000 images from different cameras, with different lighting conditions, image resolutions and viewing angles.

My question is: What kind of image preprocessing would be useful to improve object detection? (For example: contrast / color normalization, noise reduction, etc.)

+11
4 answers

When pre-processing images before feeding them to a neural network, it is better to first zero-center the data and then normalize it. This usually improves accuracy, since the data is scaled to a consistent range instead of containing arbitrarily large or small values.


Here is an explanation of this from the 2016 Stanford CS231n lecture notes:

Normalization refers to normalizing the data dimensions so that they have approximately the same scale. For image data, there are two common ways to achieve this. One is to divide each dimension by its standard deviation once it has been zero-centered: (X /= np.std(X, axis = 0)). Another form of this preprocessing rescales each dimension so that its min and max are -1 and 1, respectively. It only makes sense to apply this preprocessing if you have reason to believe that different input features have different scales (or units) but should be of approximately equal importance to the learning algorithm. In the case of images, the relative scales of the pixels are already approximately equal (and in the range 0 to 255), so it is not strictly necessary to perform this additional step.

Link for the above excerpt: http://cs231n.imtqy.com/neural-networks-2/
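For example, a minimal NumPy sketch of these two steps (zero-centering, then scaling) could look like the following; the array here is just a random stand-in for a stack of flattened training images:

```python
import numpy as np

# Random stand-in for 4000 flattened 32x32 RGB training images (one per row)
X = (np.random.rand(4000, 32 * 32 * 3) * 255.0).astype(np.float32)

X -= np.mean(X, axis=0)            # zero-center each dimension
X /= (np.std(X, axis=0) + 1e-8)    # scale each dimension to unit standard deviation

# Alternative to the line above, as in the excerpt: rescale each dimension to [-1, 1]
# X /= np.max(np.abs(X), axis=0)
```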

+7

This is of course a late response, but hopefully it helps anyone who stumbles upon this post.

Here is an article I found online, Preprocessing image data for neural networks, which is a well-written article on how to prepare image data for training a network.

The main points of the article are:

1) Since the input data (images) come in different sizes, they should be rescaled to the size the NN is designed for, usually a square, e.g. 100x100 or 250x250.

2) Compute the MEAN and STANDARD DEVIATION pixel values over all the input images in your particular dataset.

3) Normalize the input image data by subtracting the mean from each pixel and then dividing the result by the standard deviation; this accelerates convergence while training the network. The distribution of the resulting values will resemble a Gaussian curve centered at zero.

4) Dimensionality reduction: convert the images from RGB to grayscale; neural network performance is often unchanged by this, and it can make the learning task easier to solve. (A minimal sketch of these four steps is shown after this list.)
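To make the four points concrete, here is a minimal sketch (my own, not from the article) using NumPy and Pillow; the folder path, file pattern and target size are placeholders to adapt to your data:

```python
import glob
import numpy as np
from PIL import Image

TARGET_SIZE = (100, 100)                 # 1) square size the network expects (placeholder)

# Load, resize and stack all images into one float array
imgs = []
for path in glob.glob("images/*.jpg"):   # placeholder folder and pattern
    img = Image.open(path).convert("RGB").resize(TARGET_SIZE)
    imgs.append(np.asarray(img, dtype=np.float32))
X = np.stack(imgs)                       # shape: (N, 100, 100, 3)

# 2) mean and standard deviation over the whole collection
mean = X.mean(axis=0)
std = X.std(axis=0) + 1e-8               # epsilon avoids division by zero

# 3) zero-center and scale; the result is roughly Gaussian around zero
X_norm = (X - mean) / std

# 4) optional dimensionality reduction: RGB -> grayscale via luminance weights
X_gray = X_norm @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
```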

+7

Read this, I hope it will be helpful. The idea is to split the input image into regions; this is the approach of R-CNN (here are some examples). There are two stages to this process: object detection and segmentation. Object detection is where foreground objects are located by observing changes in the gradient. Segmentation is where those regions are combined into a high-contrast image. High-quality detectors use Bayesian optimization, which predicts where to look next based on a local optimization point.

Basically, in answer to your question: all of the preprocessing options you listed seem good. Contrast and color normalization help the computer recognize different objects, and noise reduction makes the gradients easier to distinguish.
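As an illustration (my own sketch, not part of the original answer), contrast normalization and noise reduction could be done with OpenCV along these lines; the file name and parameter values are placeholders:

```python
import cv2

img = cv2.imread("frame.jpg")                        # placeholder file name

# Noise reduction: non-local-means denoising on the color image
denoised = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)

# Contrast normalization: apply CLAHE to the luminance channel only,
# so local contrast is equalized while colors are preserved
lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2Lab)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
out = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_Lab2BGR)

cv2.imwrite("frame_preprocessed.jpg", out)
```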

I hope all this information is useful to you!

+1

In addition to what is mentioned above, a great way to improve the quality of low-resolution (LR) images is super-resolution using deep learning: building a model that converts a low-resolution image into a high-resolution one. We can create a low-resolution image from a high-resolution one using degradation functions (filters such as blur). Essentially, LR = degradation(HR), where the degradation function converts a high-resolution image to a low-resolution one. If we can find the inverse function, we can convert a low-resolution image to a high-resolution one. This can be framed as a supervised learning problem and solved with deep learning by approximating the inverse function. I stumbled upon this interesting article introducing super-resolution using deep learning. Hope this helps.
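A minimal sketch (an assumption on my part, not the linked article's code) of building such supervised pairs, where the degradation function is just blur plus downsampling, might look like this:

```python
import cv2

def degrade(hr, scale=4):
    """Toy degradation function: Gaussian blur, then downsample by `scale`."""
    blurred = cv2.GaussianBlur(hr, (5, 5), 1.5)
    h, w = hr.shape[:2]
    return cv2.resize(blurred, (w // scale, h // scale), interpolation=cv2.INTER_AREA)

hr = cv2.imread("high_res_sample.jpg")    # placeholder path
lr = degrade(hr)                          # LR = degradation(HR)

# (lr, hr) is one training pair; a network is then trained to approximate the
# inverse mapping lr -> hr, e.g. with a pixel-wise L1 or L2 loss.
```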

0

Source: https://habr.com/ru/post/1262181/
