I am working with a very small dataset to predict 6 classes. I have tried a large number of models and architectures, but the problem stays the same: when training starts, training accuracy slowly increases and training loss decreases, while validation does the exact opposite. I have seriously tried to combat overfitting, and I still can't believe that overfitting alone is what's causing this issue.
What I tried
Transfer learning with VGG16:
- excluding the top layers and adding a 256-unit dense layer followed by a 6-unit softmax output layer
- fine-tuning the top convolutional block
- fine-tuning the top 3-4 convolutional blocks
To combat overfitting, I use heavy augmentation in Keras and dropout with p = 0.5 after the 256-unit dense layer.
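For reference, the transfer-learning setup described above can be sketched as follows. This is an illustrative sketch, not my exact script: the 224x224 input size and the `block5` layer-name prefix are assumptions based on the standard `keras.applications` VGG16.

```python
# Sketch of the VGG16 transfer-learning setup described above.
# Assumptions: 224x224 RGB input, fine-tuning only the top conv block (block5).
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Dropout, Flatten

# Load VGG16 without its classifier head.
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze everything except the top convolutional block.
for layer in base.layers:
    layer.trainable = layer.name.startswith('block5')

# New head: 256-unit dense layer with dropout, then a 6-class softmax output.
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(6, activation='softmax')(x)

model = Model(inputs=base.input, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='SGD',
              metrics=['accuracy'])
```

For the "fine-tune the top 3-4 blocks" variants, the freezing condition is widened to the corresponding `block3`/`block4` prefixes.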
Building my own CNN with a VGG16-like architecture:
- batch normalization wherever possible
- L2 regularization on each convolutional and dense layer
- dropout somewhere between 0.5 and 0.8 after each convolutional, dense, and pooling layer
- on-the-fly data augmentation in Keras
Suspecting that maybe I have too many free parameters:
- reducing the network to only 2 convolutional blocks + dense + output
- handling overfitting in the same manner as described above
Without exception, every training run looks the same:
[plot: training and validation loss + accuracy]
The last architecture mentioned is as follows:
from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                          Activation, Dropout, Flatten, Dense)
from keras import regularizers

reg = 0.0001

model = Sequential()

# Block 1
model.add(Conv2D(8, (3, 3), input_shape=input_shape, padding='same',
                 kernel_regularizer=regularizers.l2(reg)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.7))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# Block 2
model.add(Conv2D(16, (3, 3), padding='same',
                 kernel_regularizer=regularizers.l2(reg)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.7))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# Classifier head
model.add(Flatten())
model.add(Dense(16, kernel_regularizer=regularizers.l2(reg)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(6))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='SGD',
              metrics=['accuracy'])
Keras flow_from_directory:
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rotation_range=10,
                                   width_shift_range=0.05,
                                   height_shift_range=0.05,
                                   shear_range=0.05,
                                   zoom_range=0.05,
                                   rescale=1/255.,
                                   fill_mode='nearest',
                                   channel_shift_range=0.2 * 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical')

validation_datagen = ImageDataGenerator(rescale=1/255.)

validation_generator = validation_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=1,
    shuffle=True,
    class_mode='categorical')