Thanks for looking at my question. I am trying to classify images into 40 classes using pretrained models. I want to use pretrained VGG and Xception models to convert each image into two 1000-dimensional vectors and concatenate them into a single 1 * 2000 vector, which serves as the input to my network. The network has a 40-dimensional output and 2 hidden layers, one with 1024 neurons and the other with 512 neurons.
Structure: image → VGG (size 1 * 1000), Xception (size 1 * 1000) → concatenated (size 1 * 2000) as input → 1024 neurons → 512 neurons → 40-dimensional output → softmax
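For illustration, here is a minimal NumPy sketch of the forward pass through that structure. It only checks the shapes and data flow; it is not the actual training code. The random arrays stand in for the real VGG and Xception outputs, and the ReLU activations, the weight initialisation, the batch size, and the helper names (`dense`, `softmax`, `init_layer`) are all assumptions, since the description does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(x, w, b, activation=None):
    """Fully connected layer: x @ w + b, optionally followed by ReLU."""
    z = x @ w + b
    return np.maximum(z, 0.0) if activation == "relu" else z


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_layer(n_in, n_out):
    """Random weights and zero biases (He-style scaling is an assumption)."""
    w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
    return w, np.zeros(n_out)


# Placeholder features: in the real pipeline these would be the 1000-dim
# outputs of the pretrained VGG and Xception models for a batch of images.
batch = 4  # hypothetical batch size
vgg_features = rng.random((batch, 1000))
xception_features = rng.random((batch, 1000))

# Concatenate along the feature axis: two (batch, 1000) -> one (batch, 2000).
x = np.concatenate([vgg_features, xception_features], axis=1)

w1, b1 = init_layer(2000, 1024)
w2, b2 = init_layer(1024, 512)
w3, b3 = init_layer(512, 40)

h1 = dense(x, w1, b1, activation="relu")   # (batch, 1024)
h2 = dense(h1, w2, b2, activation="relu")  # (batch, 512)
probs = softmax(dense(h2, w3, b3))         # (batch, 40), each row sums to 1

print(x.shape, h1.shape, h2.shape, probs.shape)
```

Each row of `probs` is a distribution over the 40 classes, so the predicted class for an image would be `probs.argmax(axis=1)`.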
However, using this structure, I can only achieve 30% accuracy. So my question is, how can I optimize the structure of my network to achieve higher accuracy? I am new to deep learning, so I'm not sure that my current design is "correct". I look forward to your advice.