I came across this example of a minimal echo state network (ESN), which I am analyzing while trying to understand Echo State Networks. Unfortunately, I have trouble understanding why it really works. It boils down to two questions:
- What determines the echo state of an ESN?
- What makes an ESN learn a complex nonlinear function like the Mackey-Glass series so easily and quickly?
First, here is a small piece of code that shows the important part of the initialization:
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Generate the ESN reservoir
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
rand('seed', 42);
trainLen = 2000;
testLen = 2000;
initLen = 100;
data = load('MackeyGlass_t17.txt');
% Input neurons
inSize = 1;
% Output neurons
outSize = 1;
% Reservoir size
resSize = 1000;
% Leaking rate
a = 0.3;
% Input weights
Win = ( rand(resSize, (inSize+1) ) - 0.5) .* 1;
% Reservoir weights
W = rand(resSize, resSize) - 0.5;
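As far as I have read, this is also the point where the echo state from my first question is usually created: the full minimal-ESN script rescales W by its spectral radius so that the influence of past inputs fades out instead of blowing up. A sketch of that step, using the variables above (the target radius 1.25 is an assumed value I have seen in minimal ESN examples):
% Rescale W to a given spectral radius (sketch; 1.25 is an assumed value).
% A spectral radius around or below 1 makes past inputs fade out, which
% is what gives the reservoir the echo state property.
opt.disp = 0;
rhoW = abs(eigs(W, 1, 'LM', opt));   % largest-magnitude eigenvalue of W
W = W .* (1.25 / rhoW);              % scale the reservoir weights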
Running the reservoir:
As far as I understand it, the first initLen steps are not written into X; they only warm up the reservoir. After that, X collects the "states". But what exactly is such a state, and why does X have to collect them?
Since 1900 steps are recorded, X ends up as a 1002x1900 matrix: each column holds the constant bias 1 (as far as I can tell), the current input u, and the reservoir activation x: X(:,t-initLen) = [1;u;x];
Here is the code:
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Run the reservoir with the data and collect X.
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Allocate memory for the design (collected states) matrix
X = zeros((1+inSize) + resSize, trainLen - initLen);
% Vector of reservoir neuron activations (used for calculation)
x = zeros(resSize, 1);
% Update of the reservoir neuron activations
xUpd = zeros(resSize, 1);
for t = 1:trainLen
u = data(t);
xUpd = tanh( Win * [1;u] + W * x );
x = (1-a) * x + a * xUpd;
if ( t > initLen )
X(:,t-initLen) = [1;u;x];
end
end
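Just to convince myself of the dimensions mentioned above, I added a small sanity check after the loop (my own addition, not part of the original script):
% Sanity check (my own addition): X should be
% (1+inSize+resSize) x (trainLen-initLen) = 1002 x 1900
assert(isequal(size(X), [1+inSize+resSize, trainLen-initLen]));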
Training:
This is the part I understand least: somehow the few lines below are the entire training. The states collected in X are used to compute the output weights Wout, and as far as I can tell this is nothing more than a linear regression on X with a (regularized) pseudo-inverse, where the targets are the series shifted one step ahead.
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Train the output
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Set the corresponding target matrix directly
Yt = data(initLen+2:trainLen+1)';
% Regularization coefficient
reg = 1e-8;
% X transposed is needed twice below, so compute it once (a little faster)
X_T = X';
% Yt * pseudo_inverse(X); (linear regression task)
Wout = Yt * X_T * (X * X_T + reg * eye(1+inSize+resSize))^(-1);
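As a side note, I rewrote that last line for myself using MATLAB's backslash operator, which solves the same ridge-regression system without forming an explicit inverse (my own reformulation, not from the script):
% Equivalent solve (my reformulation): since (X*X_T + reg*I) is symmetric,
% Wout' = (X*X_T + reg*I) \ (X*Yt'), avoiding the explicit inverse.
Wout_alt = ((X * X_T + reg * eye(1+inSize+resSize)) \ (X * Yt'))';
Up to floating-point differences, Wout_alt matches Wout.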
Now the trained ESN is run in generative mode:
Generative mode is what amazes me most. The network receives no further input data; at each step its own output is fed back as the next input, so to speak "the output becomes the input". And still it keeps reproducing the signal.
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Run the trained ESN in generative mode. No need to initialize x here,
% because x was driven by the training data and we simply continue from there.
%
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Y = zeros(outSize,testLen);
u = data(trainLen+1);
for t = 1:testLen
xUpd = tanh( Win*[1;u] + W*x );
x = (1-a)*x + a*xUpd;
% Generative mode:
u = Wout*[1;u;x];
% This would be a predictive mode:
%u = data(trainLen+t+1);
Y(:,t) = u;
end
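To see how long the free-running output stays on the target, I compute the mean squared error over the first part of the test set (my own check; errorLen = 500 is an arbitrary choice):
% Compare the generated signal with the true continuation of the series
% (my own check; errorLen is an assumed value).
errorLen = 500;
mse = sum((data(trainLen+2:trainLen+errorLen+1)' - Y(1,1:errorLen)).^2) / errorLen;
disp(['MSE = ', num2str(mse)]);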
And here is the result I get (as a plot):
[plot: target signal and the freely generated signal]
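The plot itself is produced along these lines (my reconstruction of the plotting code, not part of the excerpt above):
% Plot target vs. generated signal (assumed plotting code).
figure;
plot(data(trainLen+2:trainLen+testLen+1), 'b');
hold on;
plot(Y', 'r');
legend('Target signal', 'Generated (free-running) signal');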
, "", . , , , , - , Echo State Network.