I want to load a pre-trained model (optimized with AdadeltaOptimizer) and continue training it with SGD (GradientDescentOptimizer). The model is saved and loaded with the TensorLayer API:
Saving the model:

import tensorlayer as tl

# Save all network parameters into an .npz file.
tl.files.save_npz(network.all_params,
                  name=model_dir + "model-%d.npz" % global_step)
Loading the model:

# Restore the saved parameters into the network through the active session.
load_params = tl.files.load_npz(path=resume_dir + '/', name=model_name)
tl.files.assign_params(sess, load_params, network)
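
For context, a minimal sketch of the continuation setup (assuming a cross-entropy tensor `cost` from the existing graph; the learning rate and the exact ordering are illustrative):

import tensorflow as tf
import tensorlayer as tl

# Replace the Adadelta training op with plain SGD.
# (`cost` is the cross-entropy loss tensor; the learning rate is illustrative.)
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

sess = tf.Session()
# Initialize all variables first, then restore the saved weights,
# so that the initializer does not overwrite the loaded parameters.
sess.run(tf.global_variables_initializer())
load_params = tl.files.load_npz(path=resume_dir + '/', name=model_name)
tl.files.assign_params(sess, load_params, network)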
If I continue training with Adadelta, the training loss (cross-entropy) looks fine: it starts close to the value the loaded model had reached. However, if I switch the optimizer to SGD, the training loss starts out the same as for a freshly initialized model.
I looked at the file model-xxx.npz written by tl.files.save_npz. It stores only the model parameters as ndarrays. I'm not sure how the optimizer state or the learning rate comes into play.
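
The file can be inspected directly with NumPy, which shows only weight arrays and no optimizer state (the file name below is illustrative):

import numpy as np

# List the contents of the saved checkpoint (file name is illustrative).
data = np.load("model-xxx.npz", allow_pickle=True)
for key in data.files:
    print(key, np.shape(data[key]))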