Keras fit_generator adds extra samples during training

train_datagen = ImageDataGenerator(
                    rescale=1./255,
                    shear_range=0.1,
                    zoom_range=0.1,
                    rotation_range=5.,
                    width_shift_range=0.1,
                    height_shift_range=0.1)

val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
                    train_data_dir,
                    target_size=(img_width, img_height),
                    batch_size=20,
                    shuffle=True,
                    classes=TYPES,
                    class_mode='categorical')

validation_generator = val_datagen.flow_from_directory(
                    val_data_dir,
                    target_size=(img_width, img_height),
                    batch_size=20,
                    shuffle = True,
                    classes = TYPES,
                    class_mode = 'categorical')

model.fit_generator(
                train_generator,
                samples_per_epoch = 2000,
                nb_epoch = 20
            )

Epoch 14/50
 480/2000 [======>.......................] - ETA: 128s - loss: 0.8708

Epoch 13/50
2021/2000 [==============================] - 171s - loss: 0.7973 - acc: 0.7041 

My ImageDataGenerators read 2261 training and 567 validation images from a folder. I am trying to train my model with samples_per_epoch = 2000 and batch_size = 20. samples_per_epoch is evenly divisible by batch_size, but somehow the epoch picks up extra samples and shows this warning:

UserWarning: Epoch comprised more than `samples_per_epoch` samples, which might affect learning results. Set `samples_per_epoch` correctly to avoid this warning.

It works with a single GPU, but when I try to train with multiple GPUs it gives this error:

InvalidArgumentError (see above for traceback): Incompatible shapes: [21] vs. [20]
	 [[Node: Equal = Equal[T=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](ArgMax, ArgMax_1)]]
	 [[Node: gradients/concat_25_grad/Slice_1/_10811 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:1", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_101540_gradients/concat_25_grad/Slice_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:1"]]

Code:

...


samples_per_epoch should be a whole multiple of batch_size. In your case, round 2261 down to 2260: use steps_per_epoch = 113 with batch_size = 20.
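A minimal sketch of that calculation, using the counts from the question (2261 images, batches of 20):

```python
# Drop the trailing partial batch so the generator yields exactly
# steps_per_epoch full batches each epoch.
n_train = 2261
batch_size = 20

steps_per_epoch = n_train // batch_size            # 113 full batches
samples_per_epoch = steps_per_epoch * batch_size   # 2260 samples

print(steps_per_epoch, samples_per_epoch)  # 113 2260
```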


Set samples_per_epoch to your_train_data.shape[0].
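In Keras 1 terms that means passing the exact length of the training array; `X_train` below is a hypothetical stand-in for your data:

```python
import numpy as np

# Hypothetical training set: 2261 RGB images of 150x150 pixels.
X_train = np.zeros((2261, 150, 150, 3), dtype="float32")

# One pass over every sample per epoch, no rounding surprises.
samples_per_epoch = X_train.shape[0]
print(samples_per_epoch)  # 2261
```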


When you train on 4 GPUs, each batch is split into 4 sub-batches (one per GPU), so the effective batch is 4 * 20 (# GPUs * batch_size). 2261 / 4 = 565.25, so your sample count is not divisible by 4.

The last, incomplete batch cannot be split evenly across the devices, and that mismatch is what the [21] vs. [20] error is complaining about.

Make the number of samples per epoch divisible by the number of GPUs times the batch size.
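The arithmetic in this answer can be sketched as follows (4 GPUs and a batch size of 20 are the numbers from the question; trimming to a multiple of the effective batch is my reading of the suggested fix):

```python
# With data parallelism each batch is split across the GPUs, so an
# epoch must contain a whole number of effective (n_gpus * batch_size)
# batches, or the last batch cannot be divided evenly.
n_gpus = 4
batch_size = 20
n_train = 2261

effective_batch = n_gpus * batch_size                              # 80 samples per step
samples_per_epoch = (n_train // effective_batch) * effective_batch # 2240 samples

print(effective_batch, samples_per_epoch)  # 80 2240
```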


Source: https://habr.com/ru/post/1016677/

