MLP Neural network does not train correctly, probably converges to a local minimum

I am creating a backpropagation MLP neural network in Matlab. The problem is that it does not seem to handle curves in a function efficiently, and it also does not scale well with values. For example, it can reach about 80% of cos(x), but if I use 100*cos(x), it just won't train at all.

What is even stranger is that it can train some functions well, while others simply do not work at all. For example:

Well trained: http://img515.imageshack.us/img515/2148/coscox3.jpg

Not so good: http://img252.imageshack.us/img252/5370/cos2d.jpg (it smoothed out after running for a long time)

Incorrect results, such as: http://img717.imageshack.us/img717/2145/ex2ug.jpg

This is the algorithm I'm trying to implement:

http://img594.imageshack.us/img594/9590/13012012001.jpg

http://img27.imageshack.us/img27/954/13012012002.jpg

And this is my implementation:

close all;clc;
j=[4,3,1]; %number neurons in hidden layers and output layer
i=[1,j(1),j(2)];
X=0:0.1:pi;
d=cos(X);
%-----------Weights------------%
%-----First layer weights------%
W1p=rand([i(1)+1,j(1)]); W1p=W1p/sum(W1p(:));
W1=rand([i(1)+1,j(1)]);  W1=W1/sum(W1(:));
%-----Second layer weights------%
W2p=rand([i(2)+1,j(2)]); W2p=W2p/sum(W2p(:));
W2=rand([i(2)+1,j(2)]);  W2=W2/sum(W2(:));
%-----Third layer weights------%
W3p=rand([i(3)+1,j(3)]); W3p=W3p/sum(W3p(:));
W3=rand([i(3)+1,j(3)]);  W3=W3/sum(W3(:));
%-----------/Weights-----------%
V1=zeros(1,j(1));
V2=zeros(1,j(2));
V3=zeros(1,j(3));
Y1a=zeros(1,j(1)); Y1=[0 Y1a];
Y2a=zeros(1,j(2)); Y2=[0 Y2a];
O=zeros(1,j(3));
e=zeros(1,j(3));
%----Learning and forgetting factor-----%
alpha=0.1;
etha=0.1;
sortie=zeros(1,length(X));
while(1)
    n=randi(length(X),1);
    %---------------Feed forward---------------%
    %-----First layer-----%
    X0=[-1 X(:,n)];
    V1=X0*W1;
    Y1a=tanh(V1/2);
    %----Second layer-----%
    Y1=[-1 Y1a];
    V2=Y1*W2;
    Y2a=tanh(V2/2);
    %----Output layer-----%
    Y2=[-1 Y2a];
    V3=Y2*W3;
    O=tanh(V3/2);
    e=d(n)-O;
    sortie(n)=O;
    %------------/Feed Forward-----------------%
    %------------Backward propagation---------%
    %----Output layer-----%
    delta3=e*0.5*(1+O)*(1-O);
    W3n=W3+ alpha*(W3-W3p) + etha * delta3 * W3;
    %----Second Layer-----%
    delta2=zeros(1,length(Y2a));
    for b=1:length(Y2a)
        delta2(b)=0.5*(1-Y2a(b))*(1+Y2a(b)) * sum(delta3*W3(b+1,1));
    end
    W2n=W2 + alpha*(W2-W2p)+ (etha * delta2'*Y1)';
    %----First Layer-----%
    delta1=zeros(1,length(Y1a));
    for b=1:length(Y1a)
        for m=1:length(Y2a)
            delta1(b)=0.5*(1-Y1a(b))*(1+Y1a(b)) * sum(delta2(m)*W2(b+1,m));
        end
    end
    W1n=W1+ alpha*(W1-W1p)+ (etha * delta1'*X0)';
    W3p=W3; W3=W3n;
    W2p=W2; W2=W2n;
    W1p=W1; W1=W1n;
    figure(1);
    plot(1:length(d),d,1:length(d),sortie);
    drawnow;
end

My question is: what can I do to fix it? So far, my guess is that I have something wrong in the backpropagation, in particular when calculating the deltas and the weight updates, or that my initial weight values are wrong (too small, or independent of the initial input).
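For reference, and not as part of the original question: with the activation phi(v) = tanh(v/2) used in the script, the derivative is phi'(v) = 0.5*(1+y)*(1-y), and in textbook backpropagation every weight matrix is updated with the outer product of its layer input and its delta (the W2 update above already has that form). A minimal self-contained sketch of the output-layer step in that style, with placeholder values:

Y2     = [-1 0.2 0.3 0.1];          % layer input: bias plus hidden outputs (placeholder values)
W3     = rand(4,1); W3p = W3;       % current and previous output-layer weights
alpha  = 0.1; etha = 0.1;           % momentum and learning rate
t      = 0.5;                       % target for this sample (placeholder)
O      = tanh((Y2*W3)/2);           % forward pass through the output layer
e      = t - O;                     % output error
delta3 = e .* 0.5 .* (1+O) .* (1-O);              % local gradient for tanh(v/2)
W3n    = W3 + alpha*(W3-W3p) + etha*(Y2'*delta3); % outer product of layer input and delta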

1 answer

I am not an expert in this field, but I have had some experience with Matlab and Java neural network systems.

I suspect that using the toolbox could help you; it has helped others I know.
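For example, a minimal sketch with Matlab's Neural Network Toolbox (fitnet and train are the toolbox's standard functions; the hidden-layer size here is just an illustration, not something from the original answer):

X   = 0:0.1:pi;          % inputs, one sample per column
d   = cos(X);            % targets
net = fitnet(4);         % one hidden layer with 4 neurons
net = train(net, X, d);  % trains with Levenberg-Marquardt by default
y   = net(X);            % evaluate the trained network
plot(X, d, X, y);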

I can offer several points of information:

  • Do not expect an NN to work with all training data; sometimes the data is too complex to be classified in this way.

  • The architecture of your NN will have a major impact on convergence performance.

Finally:

  • Learning algorithms like this often train much better when the parameters are normalized to +/- 1. cos(x) is already normalized, while 100*cos(x) is not. This is because the weight updates that are required become much larger, while the training system can only take very small steps. If your data contains several different ranges, normalization is vital. May I suggest that you begin with at least an investigation into normalizing your data; a minimal sketch follows below.
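A minimal sketch of that normalization idea (the scale factor and variable names are my own assumptions, not from the answer):

X  = 0:0.1:pi;
d  = 100*cos(X);     % raw targets, far outside the (-1,1) range of the tanh output
s  = max(abs(d));    % scale factor
dn = d / s;          % normalized targets in [-1, 1]; train the network on these
% ... train on (X, dn) instead of (X, d) ...
% after training, rescale the network output back to the original range:
% y = s * sortie;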
