How to optimize a neural network for image classification using pre-trained models

Thanks for reading my question. I am trying to classify images into 40 classes using pre-trained models. I want to use pre-trained VGG and Xception models to convert each image into two 1000-dimensional vectors and concatenate them into a single 1 × 2000 vector, which serves as the input of my network; the network output has 40 dimensions. The network has 2 hidden layers, one with 1024 neurons and the other with 512 neurons.

Structure: image → VGG (1 × 1000), Xception (1 × 1000) → concatenated (1 × 2000) as input → 1024 neurons → 512 neurons → 40-dimensional output → softmax
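For reference, here is a minimal sketch of the setup described above, assuming the tensorflow.keras API and treating the two 1000-dimensional vectors as precomputed inputs:

```python
# Sketch of the architecture from the question: two 1000-d feature vectors
# (one from VGG, one from Xception) concatenated into a 1 x 2000 input,
# followed by 1024 -> 512 hidden layers and a 40-way softmax.
from tensorflow.keras import layers, Model

vgg_features = layers.Input(shape=(1000,), name="vgg_out")
xception_features = layers.Input(shape=(1000,), name="xception_out")

x = layers.Concatenate()([vgg_features, xception_features])  # 1 x 2000
x = layers.Dense(1024, activation="relu")(x)
x = layers.Dense(512, activation="relu")(x)
out = layers.Dense(40, activation="softmax")(x)

model = Model([vgg_features, xception_features], out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```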

However, with this structure I can only achieve 30% accuracy. So my question is: how can I optimize the structure of my network to achieve higher accuracy? I am new to deep learning, so I'm not sure my current design is "correct". I look forward to your advice.

+5
2 answers

I'm not quite sure I understand your network architecture, but some parts of it don't look right to me.

There are two main transfer learning scenarios:

  • ConvNet as a fixed feature extractor. Take a pre-trained network (either VGG or Xception will do; you don't need both), remove the last fully connected layer (that layer's outputs are the 1000 class scores for a different task, such as ImageNet), and treat the rest of the ConvNet as a fixed feature extractor for the new dataset. In AlexNet, for example, this would compute a 4096-D vector for each image, containing the activations of the hidden layer immediately before the classifier. Once you have extracted the 4096-D codes for all images, train a linear classifier (e.g., a linear SVM or a softmax classifier) for the new dataset.

    Tip #1: Use only one pre-trained network.

    Tip #2: You don't need multiple hidden layers for your own classifier.
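A minimal sketch of this first scenario, assuming tensorflow.keras (in practice you would pass weights="imagenet"; weights=None is used here only to avoid downloading the pre-trained files):

```python
# Sketch: one pre-trained network as a fixed feature extractor, plus a single
# linear (softmax) classifier on top -- no extra hidden layers.
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, Model

base = VGG16(weights=None, include_top=False, pooling="avg",
             input_shape=(224, 224, 3))
base.trainable = False  # fixed extractor: its weights receive no updates

out = layers.Dense(40, activation="softmax")(base.output)  # linear classifier
model = Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

Only the final Dense layer is trained; the convolutional base just produces features.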

  • Fine-tuning the ConvNet. The second strategy is not only to replace and retrain the classifier on top of the ConvNet on the new dataset, but also to fine-tune the weights of the pre-trained network by continuing backpropagation. You can fine-tune all layers of the ConvNet, or keep some of the earlier layers fixed (due to overfitting concerns) and fine-tune only the higher-level part of the network. This is motivated by the observation that the earlier ConvNet layers contain more generic features (e.g., edge detectors or color blob detectors) that should be useful for many tasks, while later layers become progressively more specific to the details of the classes in the original dataset.

    Tip #3: Freeze the early pre-trained layers.

    Tip #4: Use a low learning rate for fine-tuning, because you don't want to distort the pre-trained weights too much.
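A sketch of the fine-tuning scenario, again assuming tensorflow.keras; the cut-off index 15 and the learning rate 1e-5 are illustrative choices, not prescriptions:

```python
# Sketch: freeze the early (generic) layers of a pre-trained network and
# fine-tune the rest with a low learning rate (tips #3 and #4).
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import Adam

base = VGG16(weights=None, include_top=False, pooling="avg",
             input_shape=(224, 224, 3))
for layer in base.layers[:15]:   # keep early, generic layers fixed
    layer.trainable = False      # later layers remain trainable

out = layers.Dense(40, activation="softmax")(base.output)
model = Model(base.input, out)
model.compile(optimizer=Adam(learning_rate=1e-5),  # low fine-tuning rate
              loss="categorical_crossentropy")
```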

This architecture is much closer to the ones I have seen solve similar problems, and it has a better chance of reaching high accuracy.

+3

There are several steps you can try if the model underfits:

  • Increase the training time and decrease the learning rate. Training may be stuck in a very poor local minimum.
  • Add more layers that can extract features specific to the large number of classes.
  • Build a separate two-layer deep network for each class (yes/no for that class). This lets each network specialize in one class, instead of training one single network to learn all 40 classes.
  • Enlarge the training set, e.g., with data augmentation.
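For the last point, a minimal data augmentation sketch assuming tensorflow.keras (the transformation ranges are illustrative):

```python
# Sketch: enlarge the effective training set with random image transformations.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=20,       # random rotations up to 20 degrees
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    horizontal_flip=True,    # random left-right flips
)

# Each pass over the generator yields a differently transformed batch.
images = np.random.rand(8, 224, 224, 3).astype("float32")
batch = next(augmenter.flow(images, batch_size=8, shuffle=False))
```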
+1

Source: https://habr.com/ru/post/1271864/

