Tensorflow: Graph is finalized and cannot be modified.

I am trying to save variables through checkpoints in order to add fault tolerance to my program. I am trying to achieve this using MonitoredTrainingSession. Below is my configuration:

import tensorflow as tf

global_step = tf.Variable(10, trainable=False, name='global_step')
x = tf.constant(2)

with tf.device("/job:local/task:0"):
    y1 = tf.Variable(x + 300)

with tf.device("/job:local/task:1"):
    y2 = tf.Variable(x**2)

with tf.device("/job:local/task:2"):
    y3 = tf.Variable(5*x)

with tf.device("/job:local/task:3"):
    y0 = tf.Variable(x - 66)
    y = y0 + y1 + y2 + y3

model = tf.global_variables_initializer()
saver = tf.train.Saver(sharded=True)

chief = tf.train.ChiefSessionCreator(scaffold=None, master='grpc://localhost:2222', config=None, checkpoint_dir='/home/tensorflow/codes/checkpoints')
summary_hook = tf.train.SummarySaverHook(save_steps=None, save_secs=10, output_dir='/home/tensorflow/codes/savepoints', summary_writer=None, scaffold=None, summary_op=tf.summary.tensor_summary(name="y", tensor=y))
saver_hook = tf.train.CheckpointSaverHook(checkpoint_dir='/home/tensorflow/codes/checkpoints', save_secs=None, save_steps=True, saver=saver, checkpoint_basename='model.ckpt', scaffold=None)

# with tf.train.MonitoredSession(session_creator=ChiefSessionCreator,hooks=[saver_hook, summary_hook]) as sess:

with tf.train.MonitoredTrainingSession(master='grpc://localhost:2222', is_chief=True, checkpoint_dir='/home/tensorflow/codes/checkpoints',
    scaffold=None, hooks=[saver_hook,summary_hook], chief_only_hooks=None, save_checkpoint_secs=None, save_summaries_steps=True, config=None) as sess:

    while not sess.should_stop():
        sess.run(tf.global_variables_initializer())

    while not sess.should_stop():
        result = sess.run(y)
        print(result)

I get the following RuntimeError, which I cannot solve:

Traceback (most recent call last):
  File "add_1.py", line 39, in <module>
    sess.run(tf.global_variables_initializer())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 1187, in global_variables_initializer
    return variables_initializer(global_variables())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 1169, in variables_initializer
    return control_flow_ops.group(*[v.initializer for v in var_list], name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2773, in group
    deps.append(_GroupControlDeps(dev, ops_on_device[dev]))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2721, in _GroupControlDeps
    return no_op(name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_control_flow_ops.py", line 186, in no_op
    result = _op_def_lib.apply_op("NoOp", name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2199, in create_op
    self._check_not_finalized()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1925, in _check_not_finalized
    raise RuntimeError("Graph is finalized and cannot be modified.")
RuntimeError: Graph is finalized and cannot be modified.
3 answers

The root cause of your error is that MonitoredTrainingSession has finalized (frozen) the graph, so your tf.global_variables_initializer() call can no longer modify it.
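
The same error can be reproduced without any session. The sketch below is only an illustration, with graph and tensor names made up for the example: it builds a small graph, calls Graph.finalize() by hand (MonitoredTrainingSession finalizes the default graph when it sets up the session), and then tries to add one more op.

```python
import tensorflow as tf

# Build a tiny graph while it is still open for modification.
graph = tf.Graph()
with graph.as_default():
    x = tf.constant(2)
    y = x + 300

# Freeze the graph; from here on no new ops may be added to it.
graph.finalize()

message = None
try:
    with graph.as_default():
        # Creating any new op (here a multiplication) touches the frozen graph.
        z = y * 2
except RuntimeError as err:
    message = str(err)

print(message)
```

Calling tf.global_variables_initializer() inside the session block fails for the same reason: the call creates a new op, and it arrives after the graph has been frozen.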

Having said that, there are several things that need attention:

1) Why are you trying to reinitialize all the variables here?

while not sess.should_stop():
    sess.run(tf.global_variables_initializer())

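2) It looks like some of your code is already covered by MonitoredTrainingSession, for example the ChiefSessionCreator. Could you take another look at the source (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/monitored_session.py#L243), or at sample usages, to see how MonitoredTrainingSession is meant to be used?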


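If you need to build the graph inside a loop, you can create a fresh graph at the top of each iteration: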

import tensorflow as tf

tf.reset_default_graph()
tf.Graph().as_default()

Since your aim is to use MonitoredTrainingSession to get checkpointing, the usage is much simpler than in your example:

import tensorflow as tf

global_step = tf.contrib.framework.get_or_create_global_step()
x = tf.constant(2)
y1 = x + 300
y2 = x**2
y3 = x * 5
y0 = x - 66
y = y0 + y1 + y2 + y3
step = tf.assign_add(global_step, 1)

with tf.train.MonitoredTrainingSession(checkpoint_dir='/tmp/checkpoints') as sess:
    while not sess.should_stop():
        result, i = sess.run([y, step])
        print(result, i)
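
  • The saver and summary hooks are created by MonitoredTrainingSession for you.
  • By passing save_checkpoint_secs you can change the checkpoint frequency from the 10-minute default. A much higher frequency is rarely worth it: saving checkpoints is not free, so very frequent checkpointing slows training down.
  • ChiefSessionCreator and the gRPC configuration are only needed for distributed execution. The same goes for pinning ops to specific devices: make sure you really need it before using it, because it can slow things down if you are not careful.
  • There is no need to wrap the results of tensor operations in tf.Variable(); they are already tensors that can be evaluated directly.
  • You can pass save_summaries_steps to monitor training with TensorBoard, but by default summaries are written every 100 steps anyway.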
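
As a rough sketch of the points above, the following passes save_checkpoint_secs explicitly and lets the session stop itself after a few steps. The temporary directory, the 30-second interval and the StopAtStepHook are assumptions added for the example, not part of the original answer. The import assumes a TensorFlow build that ships the tensorflow.compat.v1 module; on TF 1.x, plain import tensorflow as tf exposes the same names.

```python
import tempfile

import tensorflow.compat.v1 as tf  # assumption: compat.v1 is available

# Hypothetical checkpoint location, used only for this example.
checkpoint_dir = tempfile.mkdtemp()

with tf.Graph().as_default():
    # Build the whole graph before the session is created.
    global_step = tf.train.get_or_create_global_step()
    x = tf.constant(2)
    y = (x - 66) + (x + 300) + x**2 + x * 5
    step = tf.assign_add(global_step, 1)

    # Stop after five steps so the loop terminates (an illustrative choice).
    stop_hook = tf.train.StopAtStepHook(last_step=5)

    # The session creates its own saver and initializes the variables;
    # no explicit initializer call and no hand-built CheckpointSaverHook.
    with tf.train.MonitoredTrainingSession(
            checkpoint_dir=checkpoint_dir,
            save_checkpoint_secs=30,
            hooks=[stop_hook]) as sess:
        while not sess.should_stop():
            result, i = sess.run([y, step])

# A checkpoint is also written when the session closes.
latest = tf.train.latest_checkpoint(checkpoint_dir)
print(result, i, latest)
```

Running the script again with the same checkpoint_dir would restore global_step from the latest checkpoint, which is the fault-tolerance behavior the question is after.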

Source: https://habr.com/ru/post/1674482/

