Keras: Reshape to connect CNN and LSTM

This question also exists as a GitHub issue. I would like to build a neural network in Keras that contains both 2D convolutions and an LSTM layer.

The network must classify MNIST. The MNIST training set consists of 60,000 grayscale images of handwritten digits from 0 to 9. Each image is 28x28 pixels.

I divided the images into four parts (left/right, top/bottom) and rearranged them in four orders to get sequences for the LSTM.

             |1|2|
    image -> ----- -> 4 sequences: |1|2|3|4|, |4|3|2|1|, |1|3|2|4|, |4|2|3|1|
             |3|4|

Each of the small sub-images is 14x14 pixels. The four sequences are stacked together along the width (it does not matter whether it is the width or the height).

This produces an array of shape [60000, 4, 1, 56, 14], where:

  • 60,000 - the number of samples
  • 4 - the number of elements in the sequence (the number of timesteps)
  • 1 - color depth (grayscale)
  • 56 and 14 - width and height
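The preprocessing described above can be sketched in plain numpy as follows. This is a sketch of my own interpretation: the exact quadrant numbering and the axis used for stacking are assumptions, not taken verbatim from the original code.

```python
import numpy as np

def make_sequences(images):
    """images: (N, 28, 28) -> (N, 4, 1, 56, 14).

    Split each image into four 14x14 quadrants, arrange them in the four
    orders listed above, and stack the four orders along the width at
    each timestep.
    """
    q = [images[:, :14, :14],   # 1: top-left
         images[:, :14, 14:],   # 2: top-right
         images[:, 14:, :14],   # 3: bottom-left
         images[:, 14:, 14:]]   # 4: bottom-right
    orders = [(0, 1, 2, 3), (3, 2, 1, 0), (0, 2, 1, 3), (3, 1, 2, 0)]
    steps = []
    for t in range(4):
        # timestep t: concatenate the t-th quadrant of each order along width
        step = np.concatenate([q[o[t]] for o in orders], axis=1)  # (N, 56, 14)
        steps.append(step[:, None, :, :])                         # add channel dim
    return np.stack(steps, axis=1)  # (N, 4, 1, 56, 14)

x = make_sequences(np.zeros((8, 28, 28)))
print(x.shape)  # (8, 4, 1, 56, 14)
```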

Now this has to be fed to a Keras model. The problem is reshaping the input between the CNN and the LSTM. I searched the web and found this question: Python keras, how to resize input after convolution layer to lstm layer

The suggested solution is a Reshape layer, which flattens the image but keeps the timesteps (unlike the Flatten layer, which collapses everything except batch_size).
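The difference can be illustrated in plain numpy terms (a sketch, not Keras code): Flatten would collapse one (4, 1, 56, 14) sample into a single long vector, while a reshape that keeps the timesteps turns it into 4 vectors of 784 values each.

```python
import numpy as np

sample = np.zeros((4, 1, 56, 14))   # one sample: 4 timesteps of 1x56x14
flattened = sample.reshape(-1)      # Flatten-style: one long vector
kept = sample.reshape(4, -1)        # Reshape keeping the timesteps
print(flattened.shape, kept.shape)  # (3136,) (4, 784)
```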

Here is my code:

    nb_filters = 32
    kernel_size = (3, 3)
    pool_size = (2, 2)
    nb_classes = 10
    batch_size = 64

    model = Sequential()
    model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                            border_mode="valid", input_shape=[1, 56, 14]))
    model.add(Activation("relu"))
    model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
    model.add(Activation("relu"))
    model.add(MaxPooling2D(pool_size=pool_size))
    model.add(Reshape((56 * 14,)))
    model.add(Dropout(0.25))
    model.add(LSTM(5))
    model.add(Dense(50))
    model.add(Dense(nb_classes))
    model.add(Activation("softmax"))

This code generates an error message:

ValueError: the total size of the new array must be unchanged

Apparently, the input to the Reshape layer is incorrect. As an alternative, I also tried passing the timesteps to the Reshape layer:

    model.add(Reshape((4, 56 * 14)))

This does not seem right either, and in any case the error stays the same.

Am I doing it right? Is Reshape a suitable tool for connecting CNN and LSTM?

There are quite complex approaches to this problem. For example: https://github.com/fchollet/keras/pull/1456 — a TimeDistributed layer, which seems to hide the timestep dimension from the following layers.

Or this: https://github.com/anayebi/keras-extra — a set of special layers for combining CNNs and LSTMs.

Why are such complicated (at least, they seem complicated to me) solutions needed if a simple Reshape does the trick?

UPDATE

Embarrassingly, I forgot that the sizes are changed by the pooling and (due to the lack of padding) by the convolutions. kgrm advised me to use model.summary() to check the shapes.

The output of the layer in front of the Reshape layer is (None, 32, 26, 5). I changed the reshape accordingly: model.add(Reshape((32 * 26 * 5,))).
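The shape (None, 32, 26, 5) follows directly from the arithmetic: each "valid" 3x3 convolution trims 2 pixels per spatial dimension, and 2x2 max pooling halves both. A quick check of that arithmetic (a sketch; the helper names are mine):

```python
def conv_valid(size, k=3):
    # a "valid" (unpadded) kxk convolution shrinks each dimension by k - 1
    return size - (k - 1)

def pool(size, p=2):
    # pxp max pooling divides each dimension by p (floor)
    return size // p

h, w = 56, 14
h, w = conv_valid(h), conv_valid(w)  # first Conv2D  -> 54 x 12
h, w = conv_valid(h), conv_valid(w)  # second Conv2D -> 52 x 10
h, w = pool(h), pool(w)              # MaxPooling2D  -> 26 x 5
print(h, w)  # 26 5
```

Together with the 32 filters, this gives (None, 32, 26, 5).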

Now the ValueError is gone; instead the LSTM complains:

Exception: Input 0 is incompatible with layer lstm_5: expected ndim=3, found ndim=2

It seems that I need to carry the time dimension through the entire network. How can I do this? If I add it to the input_shape of the convolution, it also complains: Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode="valid", input_shape=[4, 1, 56, 14])

Exception: Input 0 is incompatible with layer convolution2d_44: expected ndim=4, found ndim=5

1 answer

According to the Convolution2D documentation, its input must be 4-dimensional, with dimensions (samples, channels, rows, cols). This is the direct reason you get the error.

To solve this problem you should use the TimeDistributed wrapper. It lets you apply static (non-recurrent) layers across time.
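A minimal sketch of that approach, using the modern tf.keras API rather than the Keras 1 API from the question (channels-last, so each timestep is (56, 14, 1) instead of (1, 56, 14)); the layer sizes follow the question's code:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, Dense, Dropout, Flatten,
                                     LSTM, MaxPooling2D, TimeDistributed)

model = Sequential()
# TimeDistributed applies the wrapped layer to every timestep independently,
# so the convolutional stack never sees the time dimension.
model.add(TimeDistributed(Conv2D(32, (3, 3), activation="relu"),
                          input_shape=(4, 56, 14, 1)))
model.add(TimeDistributed(Conv2D(32, (3, 3), activation="relu")))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Flatten()))  # flatten each timestep separately
model.add(Dropout(0.25))
model.add(LSTM(5))                     # now sees (timesteps, features)
model.add(Dense(50))
model.add(Dense(10, activation="softmax"))

model.summary()
```

After TimeDistributed(Flatten()) the tensor has shape (None, 4, features), which is exactly the 3-dimensional input the LSTM expects.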


Source: https://habr.com/ru/post/1011658/
