I read all the messages on the network that concern the problem, when people forgot to change the target vector to a matrix, and since the problem remains after this change, I decided to ask my question here. Workarounds are listed below, but new problems appear and I am grateful for the suggestions!
Using the convolution network setting and binary cross-entropy with the sigmoid activation function, I get the problem of size mismatch, but not during training data, only during verification / verification data verification. For some strange reason, from my vectors, the verification vector changes its dimension, and I have no idea why. Learning, as mentioned above, works great. Below is the code, thanks for the help (and sorry to capture the stream, but I did not see any reason to create a new one), most of which was copied from an example of a lasagna lesson.
Workarounds and new challenges:
- Removing "axis = 1" in the valAcc definition helps, but the accuracy of the test remains zero, and the classification of tests always returns the same result, regardless of the number of nodes, layers, filters, etc. I have. Even resizing the training set (I have about 350 samples for each class with 48x64 grayscale images) does not change this. So something seems.
Networking:
def build_cnn(imgSet, input_var=None):
network = lasagne.layers.InputLayer(shape=(None, \
imgSet.shape[1], imgSet.shape[2], imgSet.shape[3]), input_var=input_var)
network = lasagne.layers.Conv2DLayer(
network, num_filters=32, filter_size=(5, 5),
nonlinearity=lasagne.nonlinearities.rectify,
W=lasagne.init.GlorotUniform())
network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))
network = lasagne.layers.Conv2DLayer(
network, num_filters=16, filter_size=(5, 5),
nonlinearity=lasagne.nonlinearities.rectify)
network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))
network = lasagne.layers.DenseLayer(
lasagne.layers.dropout(network, p=.25),
num_units=64,
nonlinearity=lasagne.nonlinearities.rectify)
network = lasagne.layers.DenseLayer(
lasagne.layers.dropout(network, p=.5),
num_units=1,
nonlinearity=lasagne.nonlinearities.sigmoid)
return network
The target matrices for all sets are created as follows (as an example, the target learning target is used)
targetsTrain = np.vstack( (targetsTrain, [[targetClass], ]*numTr) );
... and variables like anano as such
inputVar = T.tensor4('inputs')
targetVar = T.imatrix('targets')
network = build_cnn(trainset, inputVar)
predictions = lasagne.layers.get_output(network)
loss = lasagne.objectives.binary_crossentropy(predictions, targetVar)
loss = loss.mean()
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params, learning_rate=0.01, momentum=0.9)
valPrediction = lasagne.layers.get_output(network, deterministic=True)
valLoss = lasagne.objectives.binary_crossentropy(valPrediction, targetVar)
valLoss = valLoss.mean()
valAcc = T.mean(T.eq(T.argmax(valPrediction, axis=1), targetVar), dtype=theano.config.floatX)
train_fn = function([inputVar, targetVar], loss, updates=updates, allow_input_downcast=True)
val_fn = function([inputVar, targetVar], [valLoss, valAcc])
Finally, there are two loops, training and test. The first is normal, the second throws an error, excerpts are lower
# -- Neural network training itself -- #
numIts = 100
for itNr in range(0, numIts):
train_err = 0
train_batches = 0
for batch in iterate_minibatches(trainset.astype('float32'), targetsTrain.astype('int8'), len(trainset)
inputs, targets = batch
print (inputs.shape)
print(targets.shape)
train_err += train_fn(inputs, targets)
train_batches += 1
# And a full pass over the validation data:
val_err = 0
val_acc = 0
val_batches = 0
for batch in iterate_minibatches(valset.astype('float32'), targetsVal.astype('int8'), len(valset)
[inputs, targets] = batch
[err, acc] = val_fn(inputs, targets)
val_err += err
val_acc += acc
val_batches += 1
Error (excerpts)
Exception "unhandled ValueError"
Input dimension mis-match. (input[0].shape[1] = 52, input[1].shape[1] = 1)
Apply node that caused the error: Elemwise{eq,no_inplace}(DimShuffle{x,0}.0, targets)
Toposort index: 36
Inputs types: [TensorType(int64, row), TensorType(int32, matrix)]
Inputs shapes: [(1, 52), (52, 1)]
Inputs strides: [(416, 8), (4, 4)]
Inputs values: ['not shown', 'not shown']
Thanks again for the help!