I am working on a regression problem with Keras + TensorFlow, and I found something interesting. 1) Here are two models that are actually the same, except that the first one uses a globally defined optimizer:
optimizer = Adam()

def OneHiddenLayer_Model():
    model = Sequential()
    model.add(Dense(300 * inputDim, input_dim=inputDim, kernel_initializer='normal', activation=activationFunc))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model
def OneHiddenLayer_Model2():
    model = Sequential()
    model.add(Dense(300 * inputDim, input_dim=inputDim, kernel_initializer='normal', activation=activationFunc))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer=Adam())
    return model
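The only difference between the two functions is that OneHiddenLayer_Model always compiles with the same global Adam instance, so whatever state that instance has accumulated (step counter, moment estimates) is still there when the next model compiles with it. This is not Keras code, just a minimal NumPy sketch of Adam's update rule for one parameter, showing that the same gradient produces a different step size from a fresh state than from a warm state (the warm-state numbers here are made up for illustration):

```python
import numpy as np

def adam_step(grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update for a single scalar parameter; returns the step
    # taken and the updated optimizer state (m, v, t).
    t += 1
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
    step = lr * m_hat / (np.sqrt(v_hat) + eps)
    return step, m, v, t

grad = 0.5

# Fresh optimizer state -- what a newly created Adam() starts from:
step_fresh, *_ = adam_step(grad, m=0.0, v=0.0, t=0)

# Warm state, as if the optimizer had already taken many updates --
# roughly what a reused global optimizer could carry into a later fit:
step_warm, *_ = adam_step(grad, m=0.4, v=0.3, t=10000)

print(step_fresh, step_warm)  # the two step sizes differ
```

So even with identical initial weights and identical data, a reused optimizer does not behave like a fresh one.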
2) Then I train with two schemes on the same data sets (training set (scaleX, Y), test set (scaleTestX, testY)).
2.1) Scheme 1: two consecutive fits with the first model
numpy.random.seed(seed)
model = OneHiddenLayer_Model()
model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=250, batch_size=numBatch, verbose=0)
numpy.random.seed(seed)
model = OneHiddenLayer_Model()
history = model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=500, batch_size=numBatch, verbose=0)
predictY = model.predict(scaleX)
predictTestY = model.predict(scaleTestX)
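Re-seeding NumPy before each construction does replay the same random sequence, which is why the two models are expected to start from identical weights (assuming the initializers draw from NumPy's RNG); what numpy.random.seed does not reset is the state inside the shared optimizer object. A quick check of the replay behavior:

```python
import numpy as np

seed = 7
np.random.seed(seed)
a = np.random.normal(size=3)

np.random.seed(seed)   # re-seeding replays the exact same sequence
b = np.random.normal(size=3)

assert np.allclose(a, b)  # identical draws after re-seeding
```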
2.2) Scheme 2: a single fit with the second model
numpy.random.seed(seed)
model = OneHiddenLayer_Model2()
history = model.fit(scaleX, Y, validation_data=(scaleTestX, testY), epochs=500, batch_size=numBatch, verbose=0)
predictY = model.predict(scaleX)
predictTestY = model.predict(scaleTestX)
3) Finally, the results for each scheme are shown below (model loss history → prediction on scaleX → prediction on scaleTestX).
3.1) Scheme 1
(figures omitted: loss history and predictions for Scheme 1)
3.2) Scheme 2 (with 500 epochs)
(figures omitted: loss history and predictions for Scheme 2, 500 epochs)
3.3) another test with Scheme 2, with epochs = 1000
(figures omitted: loss history and predictions for Scheme 2, 1000 epochs)
As the figures show, Scheme 1 gives better results than Scheme 2, even when Scheme 2 is trained for more epochs.
My question: why does Scheme 1 perform better than Scheme 2? Thanks!