Problems implementing backpropagation in Octave

I have written code to implement backpropagation with steepest gradient descent, and I am having problems with it. I use the Machine CPU dataset and have scaled the inputs and outputs to the range [0, 1].
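The scaling itself is not shown below; it is just a column-wise min-max normalisation, something along these lines (a minimal sketch, not my exact code; Xraw and yraw are placeholder names):

  % Column-wise min-max scaling to [0, 1] (illustrative only; Xraw, yraw are placeholders)
  Xmin = min (Xraw);  Xmax = max (Xraw);
  X = (Xraw - Xmin) ./ (Xmax - Xmin);
  ymin = min (yraw);  ymax = max (yraw);
  y = (yraw - ymin) ./ (ymax - ymin);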

The MATLAB/Octave code is as follows:

Backpropagation with steepest gradient descent

  %SGD = Steepest Gradient Descent
  function weights = nnSGDTrain (X, y, nhid_units, gamma, max_epoch, X_test, y_test)
    iput_units = columns (X);
    oput_units = columns (y);
    n = rows (X);
    W2 = rand (nhid_units + 1, oput_units);
    W1 = rand (iput_units + 1, nhid_units);
    train_rmse = zeros (1, max_epoch);
    test_rmse = zeros (1, max_epoch);

    for (epoch = 1:max_epoch)
      delW2 = zeros (nhid_units + 1, oput_units)';
      delW1 = zeros (iput_units + 1, nhid_units)';

      for (i = 1:rows(X))
        o1 = sigmoid ([X(i,:), 1] * W1);                %1xn+1 * n+1xk = 1xk
        o2 = sigmoid ([o1, 1] * W2);                    %1xk+1 * k+1xm = 1xm
        D2 = o2 .* (1 - o2);
        D1 = o1 .* (1 - o1);
        e = (y_test(i,:) - o2)';
        delta2 = diag (D2) * e;                         %mxm * mx1 = mx1
        delta1 = diag (D1) * W2(1:(end-1),:) * delta2;  %kxm * mx1 = kx1
        delW2 = delW2 + (delta2 * [o1 1]);              %mx1 * 1xk+1 = mxk+1 %already transposed
        delW1 = delW1 + (delta1 * [X(i, :) 1]);         %kx1 * 1xn+1 = kxn+1 %already transposed
      end

      delW2 = gamma .* delW2 ./ n;
      delW1 = gamma .* delW1 ./ n;
      W2 = W2 + delW2';
      W1 = W1 + delW1';

      [dummy train_rmse(epoch)] = nnPredict (X, y, nhid_units, [W1(:);W2(:)]);
      [dummy test_rmse(epoch)] = nnPredict (X_test, y_test, nhid_units, [W1(:);W2(:)]);
      printf ('Epoch: %d\tTrain Error: %f\tTest Error: %f\n', epoch, train_rmse(epoch), test_rmse(epoch));
      fflush (stdout);
    end

    weights = [W1(:);W2(:)];

    % plot (1:max_epoch, test_rmse, 1);
    % hold on; plot (1:max_epoch, train_rmse(1:end), 2);
    % hold off;
  end
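For completeness, this is roughly how I call the training function (a sketch only; the hidden-unit count, learning rate and epoch count are example values, and X_all / y_all are assumed to hold the scaled dataset):

  % Illustrative call (example hyper-parameters, not my exact settings)
  X_train = X_all(1:100, :);    y_train = y_all(1:100, :);
  X_test  = X_all(101:end, :);  y_test  = y_all(101:end, :);
  weights = nnSGDTrain (X_train, y_train, 4, 0.5, 1000, X_test, y_test);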

Prediction function

  %Now SFNN Only
  function [o1 rmse] = nnPredict (X, y, nhid_units, weights)
    iput_units = columns (X);
    oput_units = columns (y);
    n = rows (X);
    W1 = reshape (weights(1:((iput_units + 1) * nhid_units),1), iput_units + 1, nhid_units);
    W2 = reshape (weights((((iput_units + 1) * nhid_units) + 1):end,1), nhid_units + 1, oput_units);
    o1 = sigmoid ([X ones(n,1)] * W1);   %nxiput_units+1 * iput_units+1xnhid_units = nxnhid_units
    o2 = sigmoid ([o1 ones(n,1)] * W2);  %nxnhid_units+1 * nhid_units+1xoput_units = nxoput_units
    rmse = RMSE (y, o2);
  end
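The code calls sigmoid, which is not shown; it is just the standard logistic function (the derivative terms D1 and D2 above assume exactly this form):

  % Standard logistic sigmoid; D1/D2 = o .* (1 - o) above are its derivative
  function s = sigmoid (z)
    s = 1 ./ (1 + exp (-z));
  end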

RMSE function

  function rmse = RMSE (a1, a2)
    rmse = sqrt (sum (sum ((a1 - a2).^2)) / rows(a1));
  end

I also trained the same dataset with the mlp function from the R package RSNNS, and its RMSE on the training set (the first 100 examples) is about 0.03. In my implementation I cannot get the RMSE below 0.14. The error sometimes even grows for higher learning rates, and no learning rate gives me an RMSE lower than 0.14. The paper I referred to also reports a training-set RMSE of about 0.03.

I want to know where the problem in my code is. I followed Raul Rojas' book and checked that everything matches it.

1 answer

In the backpropagation code, the line

  e = (y_test(i,:) - o2)'; 

is incorrect, because o2 is the network's output for a training-set example, while the difference is taken with an example from the test set, y_test. The line should be:

  e = (y(i,:) - o2)'; 

which correctly takes the difference between the current model's predicted output and the target output of the corresponding training example.
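For context, with this fix the inner loop of nnSGDTrain reads as follows (excerpt only; the rest is unchanged):

  o1 = sigmoid ([X(i,:), 1] * W1);   % forward pass (unchanged)
  o2 = sigmoid ([o1, 1] * W2);
  D2 = o2 .* (1 - o2);
  D1 = o1 .* (1 - o1);
  e = (y(i,:) - o2)';                % error against the training target, not y_test
  delta2 = diag (D2) * e;
  delta1 = diag (D1) * W2(1:(end-1),:) * delta2;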

It took me three days to find this; I was lucky to spot the error, which had been preventing me from making any further progress.


Source: https://habr.com/ru/post/1468830/

