Saving and reading a TensorFlow checkpoint

I have a TensorFlow-based neural network and a set of variables.

The learning function is as follows:

def train(load=True, step=1):
    """
    Defining the neural network is skipped here
    """
    # sess, x, y_, mse and the training data are defined elsewhere

    train_step = tf.train.AdamOptimizer(1e-4).minimize(mse)
    # Saver
    saver = tf.train.Saver()

    if not load:
        # Initializing variables
        sess.run(tf.initialize_all_variables())
    else:
        saver.restore(sess, 'Variables/map.ckpt')
        print 'Model Restored!'

    # Perform stochastic gradient descent
    for i in xrange(step):
        train_step.run(feed_dict = {x: train_data, y_: label})

    # Save model
    save_path = saver.save(sess, 'Variables/map.ckpt')
    print 'Model saved in file: ', save_path
    print 'Training Done!'

I called the training function as follows:

# First train
train(False, 1)
# Following train
for i in xrange(10):
    train(True, 10)

I trained this way because I needed to feed different data into the model on each call. However, when I call the train function like this, TensorFlow raises an error saying that it cannot read the saved model from the file.

After some experimenting, I assumed this was caused by slow checkpoint writing: before the file was fully written to disk, the next call to train would start reading it, producing the error.

I tried using time.sleep() to add a delay between the calls, but that did not help.

What am I doing wrong, and how can I fix this?

The problem is that each call to train() creates a new set of nodes in the same default TensorFlow graph. In particular, each call to tf.train.Saver() constructs a Saver that covers all of the variables created by the calls to train() so far. TensorFlow uniquifies the new copies of the variables by appending an _N suffix to their names, so you end up with:

  • A first Saver covering var_a, var_b.
  • A second Saver covering var_a, var_b, var_a_1, var_b_1.
  • A third Saver covering var_a, var_b, var_a_1, var_b_1, var_a_2, var_b_2.
  • And so on.

By default, a tf.train.Saver contains a restore op for every variable it covers. The checkpoint written by the first call contains values only for var_a and var_b, so when a later Saver tries to restore var_a_1 (a copy of var_a), it finds no value for that name in the checkpoint and fails with an error.
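To see the uniquification for yourself, here is a minimal sketch (TF 1.x API and Python 2, matching the question's code) that repeatedly creates a variable with the same requested name in one default graph:

import tensorflow as tf

# TensorFlow keeps names unique within a graph by appending _1, _2, ...
# to later copies of a variable that asks for an already-taken name.
for _ in xrange(3):
    v = tf.Variable(0.0, name='var_a')
    print v.name  # prints var_a:0, then var_a_1:0, then var_a_2:0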

The simplest solution is to create a new graph for each call to train(), so that the variable names (and hence the checkpoint contents) match on every call. The smallest change to your code is to wrap each call to train() in a with block:

# First train
with tf.Graph().as_default():
    train(False, 1)

# Following train
for i in xrange(10):
    with tf.Graph().as_default():
        train(True, 10)

...alternatively, you could move the with block inside train() itself.
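For example, here is a minimal sketch of that variant. The one-variable model, the variable name w, and the placeholder data are made up here to keep the example self-contained; the question's real network definition would go in their place:

import numpy as np
import tensorflow as tf

def train(load=True, step=1):
    # Build everything inside a brand-new graph, so that every call
    # creates identically named variables matching the checkpoint on disk.
    with tf.Graph().as_default():
        x = tf.placeholder(tf.float32, [None, 1])
        y_ = tf.placeholder(tf.float32, [None, 1])
        w = tf.Variable(tf.zeros([1, 1]), name='w')  # stand-in for the real network
        mse = tf.reduce_mean(tf.square(tf.matmul(x, w) - y_))
        train_step = tf.train.AdamOptimizer(1e-4).minimize(mse)
        saver = tf.train.Saver()

        with tf.Session() as sess:
            if not load:
                sess.run(tf.initialize_all_variables())
            else:
                saver.restore(sess, 'Variables/map.ckpt')
            data = np.ones((4, 1), dtype=np.float32)  # made-up data
            for i in xrange(step):
                sess.run(train_step, feed_dict={x: data, y_: data})
            saver.save(sess, 'Variables/map.ckpt')

With this version the original calling pattern (train(False, 1) once, then train(True, 10) in a loop) works unchanged.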

Source: https://habr.com/ru/post/1618744/

