Resilient backpropagation of a neural network - a question about the gradient

First, I want to say that I'm really new to neural networks and I don't understand them very well ;)

I made my first implementation of a backpropagation neural network in C#. I tested it with XOR and it seems to work.

Now I would like to change my implementation to use resilient backpropagation (Rprop - http://en.wikipedia.org/wiki/Rprop ).

The definition says: "Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on each weight."

Can someone tell me what the partial derivative over all patterns is? And how do I calculate this partial derivative for a neuron in a hidden layer?

Thanks a lot

UPDATE:

My implementation is based on this Java code: www_.dia.fi.upm.es/~jamartin/downloads/bpnn.java

My backPropagate method is as follows:

public double backPropagate(double[] targets)
{
    double error, change;

    // calculate error terms for output
    double[] output_deltas = new double[outputsNumber];
    for (int k = 0; k < outputsNumber; k++)
    {
        error = targets[k] - activationsOutputs[k];
        output_deltas[k] = Dsigmoid(activationsOutputs[k]) * error;
    }

    // calculate error terms for hidden
    double[] hidden_deltas = new double[hiddenNumber];
    for (int j = 0; j < hiddenNumber; j++)
    {
        error = 0.0;
        for (int k = 0; k < outputsNumber; k++)
        {
            error = error + output_deltas[k] * weightsOutputs[j, k];
        }
        hidden_deltas[j] = Dsigmoid(activationsHidden[j]) * error;
    }

    // update output weights
    for (int j = 0; j < hiddenNumber; j++)
    {
        for (int k = 0; k < outputsNumber; k++)
        {
            change = output_deltas[k] * activationsHidden[j];
            weightsOutputs[j, k] = weightsOutputs[j, k] + learningRate * change
                + momentumFactor * lastChangeWeightsForMomentumOutpus[j, k];
            lastChangeWeightsForMomentumOutpus[j, k] = change;
        }
    }

    // update input weights
    for (int i = 0; i < inputsNumber; i++)
    {
        for (int j = 0; j < hiddenNumber; j++)
        {
            change = hidden_deltas[j] * activationsInputs[i];
            weightsInputs[i, j] = weightsInputs[i, j] + learningRate * change
                + momentumFactor * lastChangeWeightsForMomentumInputs[i, j];
            lastChangeWeightsForMomentumInputs[i, j] = change;
        }
    }

    // calculate total error
    error = 0.0;
    for (int k = 0; k < outputsNumber; k++)
    {
        error = error + 0.5 * (targets[k] - activationsOutputs[k])
            * (targets[k] - activationsOutputs[k]);
    }
    return error;
}

Can I use the variable change = hidden_deltas[j] * activationsInputs[i] as the gradient (partial derivative) for checking the sign?

2 answers

I think that "over all patterns" simply means "over all training samples in each iteration"... take a look at the Rprop paper.

For the partial derivative: you have already implemented the usual backpropagation algorithm, which is a method for computing the gradient efficiently. There you calculate the δ values for individual neurons, which are effectively the negative ∂E/∂w values, i.e. the partial derivative of the global error as a function of the weights.
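To make "over all patterns" concrete, here is a minimal sketch (in Java, to match the bpnn.java reference the question builds on; the class and method names are mine, not from any library). The per-pattern contribution delta[j] * input[i] is simply summed across the whole training set, and Rprop then looks only at the sign of that sum:

```java
public class BatchGradient {
    /**
     * Sums the per-pattern gradient contributions delta[j] * input[i]
     * over all patterns, giving the batch partial derivative whose
     * sign Rprop uses.
     *
     * @param inputs       one row of input activations per pattern
     * @param hiddenDeltas one row of hidden-layer deltas per pattern
     */
    public static double[][] accumulate(double[][] inputs, double[][] hiddenDeltas) {
        int nIn = inputs[0].length;
        int nHid = hiddenDeltas[0].length;
        double[][] grad = new double[nIn][nHid];
        for (int p = 0; p < inputs.length; p++) {    // loop over all patterns
            for (int i = 0; i < nIn; i++) {
                for (int j = 0; j < nHid; j++) {
                    grad[i][j] += hiddenDeltas[p][j] * inputs[p][i];
                }
            }
        }
        return grad;
    }
}
```

In batch mode you would call this once per epoch, after running backpropagation on every pattern, and feed each grad[i][j] to the Rprop rule instead of applying a per-pattern weight change.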

Therefore, instead of multiplying these values by a learning rate, you adjust each weight's own step size by one of two constants (η+ or η−), depending on whether the sign of the partial derivative has changed.
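As an illustration of that update rule, here is a per-weight Rprop sketch (again in Java; the field names are mine, and the constants are the standard defaults from Riedmiller and Braun's paper). On a sign change it follows the common iRprop− variant: the step size shrinks and the weight update is skipped for that epoch:

```java
public class RpropUpdater {
    // Standard Rprop constants from Riedmiller & Braun
    static final double ETA_PLUS = 1.2;   // step-size increase factor
    static final double ETA_MINUS = 0.5;  // step-size decrease factor
    static final double DELTA_MAX = 50.0; // upper bound on the step size
    static final double DELTA_MIN = 1e-6; // lower bound on the step size

    double stepSize = 0.1;     // this weight's current update value
    double lastGradient = 0.0; // batch gradient from the previous epoch

    /**
     * Returns the change to add to one weight, given the partial
     * derivative dE/dw accumulated over all patterns of this epoch.
     */
    public double update(double gradient) {
        double signChange = gradient * lastGradient;
        if (signChange > 0) {
            // same sign as last epoch: accelerate
            stepSize = Math.min(stepSize * ETA_PLUS, DELTA_MAX);
        } else if (signChange < 0) {
            // sign flipped: we overshot a minimum, so slow down
            stepSize = Math.max(stepSize * ETA_MINUS, DELTA_MIN);
            lastGradient = 0.0; // skip adaptation in the next epoch
            return 0.0;         // iRprop- variant: no weight change this epoch
        }
        lastGradient = gradient;
        return -Math.signum(gradient) * stepSize; // step against the gradient
    }
}
```

One RpropUpdater instance (or one stepSize/lastGradient pair) is kept per weight, which is what "acts independently on each weight" means in the definition quoted above.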


The following is part of the implementation of the RPROP training method from the Encog artificial-intelligence library. It should give you an idea of how to proceed. I would recommend downloading the entire library, because it is easier to step through the source code in an IDE than through the online SVN interface.

http://code.google.com/p/encog-cs/source/browse/#svn/trunk/encog-core/encog-core-cs/Neural/Networks/Training/Propagation/Resilient

http://code.google.com/p/encog-cs/source/browse/#svn/trunk

Note that the code is in C#, but it should not be hard to translate into another language.


Source: https://habr.com/ru/post/896859/
