Please consider this simple example.
nb_samples = 100000
X = np.random.randn(nb_samples)
Y = X[1:]
X = X[:-1]
X = X.reshape((len(Y), 1, 1))
Y = Y.reshape((len(Y), 1))
So, we have basically
Y[i] = X[i-1]
and the model is just a delay operator.
I can learn this model using stateless LSTM, but I want to understand and apply stateful LSTM in Keras here.
So, I'm trying to learn this model using state-based LSTM by setting pairs of values (x, y)one by one(batch_size = 1)
model = Sequential()
model.add(LSTM(batch_input_shape=(1, 1, 1),
output_dim =10,
activation='tanh', stateful=True
)
)
model.add(Dense(output_dim=1, activation='linear'))
model.compile(loss='mse', optimizer='adam')
for epoch in range(50):
model.fit(X_train,
Y_train,
nb_epoch = 1,
verbose = 2,
batch_size = 1,
shuffle = False)
model.reset_states()
But the model does not recognize anything.
Following Marcin’s suggestion, I changed the training code as follows:
for epoch in range(10000):
model.reset_states()
train_loss = 0
for i in range(Y_train.shape[0]):
train_loss += model.train_on_batch(X_train[i:i+1],
Y_train[i:i+1],
)
print '# epoch', epoch, ' loss ', train_loss/float(Y_train.shape[0])
but I still see an average loss of about 1, which is the standard deviation of my randomly generated data, so the model does not seem to be studying.
Do I have something wrong?