Resilient backpropagation of a neural network - a question about the gradient

First, I want to say that I'm really new to neural networks and I don't understand them very well ;)

I made my first implementation of a backpropagation neural network in C#. I tested it with XOR and it seems to work.

Now I would like to change my implementation to use resilient backpropagation (Rprop - http://en.wikipedia.org/wiki/Rprop ).

The definition says: "Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on each weight."

Can someone tell me what the partial derivative over all patterns is? And how do I calculate this partial derivative for a neuron in a hidden layer?

Thanks a lot

UPDATE:

My implementation is based on this Java code: www_.dia.fi.upm.es/~jamartin/downloads/bpnn.java

My backPropagate method is as follows:

public double backPropagate(double[] targets)
{
    double error, change;

    // calculate error terms for output
    double[] output_deltas = new double[outputsNumber];
    for (int k = 0; k < outputsNumber; k++)
    {
        error = targets[k] - activationsOutputs[k];
        output_deltas[k] = Dsigmoid(activationsOutputs[k]) * error;
    }

    // calculate error terms for hidden
    double[] hidden_deltas = new double[hiddenNumber];
    for (int j = 0; j < hiddenNumber; j++)
    {
        error = 0.0;
        for (int k = 0; k < outputsNumber; k++)
        {
            error = error + output_deltas[k] * weightsOutputs[j, k];
        }
        hidden_deltas[j] = Dsigmoid(activationsHidden[j]) * error;
    }

    // update output weights
    for (int j = 0; j < hiddenNumber; j++)
    {
        for (int k = 0; k < outputsNumber; k++)
        {
            change = output_deltas[k] * activationsHidden[j];
            weightsOutputs[j, k] = weightsOutputs[j, k] + learningRate * change
                + momentumFactor * lastChangeWeightsForMomentumOutpus[j, k];
            lastChangeWeightsForMomentumOutpus[j, k] = change;
        }
    }

    // update input weights
    for (int i = 0; i < inputsNumber; i++)
    {
        for (int j = 0; j < hiddenNumber; j++)
        {
            change = hidden_deltas[j] * activationsInputs[i];
            weightsInputs[i, j] = weightsInputs[i, j] + learningRate * change
                + momentumFactor * lastChangeWeightsForMomentumInputs[i, j];
            lastChangeWeightsForMomentumInputs[i, j] = change;
        }
    }

    // calculate total error
    error = 0.0;
    for (int k = 0; k < outputsNumber; k++)
    {
        error = error + 0.5 * (targets[k] - activationsOutputs[k])
            * (targets[k] - activationsOutputs[k]);
    }
    return error;
}

Can I use the variable change = hidden_deltas[j] * activationsInputs[i] as the gradient (partial derivative) for checking the sign?

2 answers

I think that "over all patterns" simply means "over all training samples in each iteration"... take a look at the Rprop paper.

For the partial derivative: you have already implemented the usual backpropagation algorithm, which is a method for computing the gradient efficiently. There you calculate the δ values for individual neurons, which are effectively the negative ∂E/∂w values, i.e. the partial derivative of the global error as a function of the weights.
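To make "over all patterns" concrete, here is a minimal sketch (in Java, to match the bpnn.java reference the question builds on; the class and method names are mine, not from any library). The per-pattern contribution delta[j] * input[i] is simply summed across the whole training set, and Rprop then looks only at the sign of that sum:

```java
public class BatchGradient {
    /**
     * Sums the per-pattern gradient contributions delta[j] * input[i]
     * over all patterns, giving the batch partial derivative whose
     * sign Rprop uses.
     *
     * @param inputs       one row of input activations per pattern
     * @param hiddenDeltas one row of hidden-layer deltas per pattern
     */
    public static double[][] accumulate(double[][] inputs, double[][] hiddenDeltas) {
        int nIn = inputs[0].length;
        int nHid = hiddenDeltas[0].length;
        double[][] grad = new double[nIn][nHid];
        for (int p = 0; p < inputs.length; p++) {    // loop over all patterns
            for (int i = 0; i < nIn; i++) {
                for (int j = 0; j < nHid; j++) {
                    grad[i][j] += hiddenDeltas[p][j] * inputs[p][i];
                }
            }
        }
        return grad;
    }
}
```

In batch mode you would call this once per epoch, after running backpropagation on every pattern, and feed each grad[i][j] to the Rprop rule instead of applying a per-pattern weight change.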

Therefore, instead of multiplying these values by a learning rate, you adjust each weight's own step size by one of two constants (η+ or η−), depending on whether the sign of the partial derivative has changed.
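As an illustration of that update rule, here is a per-weight Rprop sketch (again in Java; the field names are mine, and the constants are the standard defaults from Riedmiller and Braun's paper). On a sign change it follows the common iRprop− variant: the step size shrinks and the weight update is skipped for that epoch:

```java
public class RpropUpdater {
    // Standard Rprop constants from Riedmiller & Braun
    static final double ETA_PLUS = 1.2;   // step-size increase factor
    static final double ETA_MINUS = 0.5;  // step-size decrease factor
    static final double DELTA_MAX = 50.0; // upper bound on the step size
    static final double DELTA_MIN = 1e-6; // lower bound on the step size

    double stepSize = 0.1;     // this weight's current update value
    double lastGradient = 0.0; // batch gradient from the previous epoch

    /**
     * Returns the change to add to one weight, given the partial
     * derivative dE/dw accumulated over all patterns of this epoch.
     */
    public double update(double gradient) {
        double signChange = gradient * lastGradient;
        if (signChange > 0) {
            // same sign as last epoch: accelerate
            stepSize = Math.min(stepSize * ETA_PLUS, DELTA_MAX);
        } else if (signChange < 0) {
            // sign flipped: we overshot a minimum, so slow down
            stepSize = Math.max(stepSize * ETA_MINUS, DELTA_MIN);
            lastGradient = 0.0; // skip adaptation in the next epoch
            return 0.0;         // iRprop- variant: no weight change this epoch
        }
        lastGradient = gradient;
        return -Math.signum(gradient) * stepSize; // step against the gradient
    }
}
```

One RpropUpdater instance (or one stepSize/lastGradient pair) is kept per weight, which is what "acts independently on each weight" means in the definition quoted above.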


The following is part of the implementation of the RPROP training method from the Encog artificial-intelligence library. It should give you an idea of how to proceed. I would recommend downloading the entire library, because it is easier to step through the source code in an IDE than through the online SVN interface.

http://code.google.com/p/encog-cs/source/browse/#svn/trunk/encog-core/encog-core-cs/Neural/Networks/Training/Propagation/Resilient

http://code.google.com/p/encog-cs/source/browse/#svn/trunk

Note that the code is in C#, but it should not be hard to translate into another language.


Source: https://habr.com/ru/post/896859/
