For the past few days I have been having trouble serializing data to the TFRecord format and then deserializing it with `tf.parse_single_sequence_example`. I am trying to read data for a fairly standard RNN model, but this is my first attempt at using the TFRecord format and its associated input pipeline.
Here is a toy example that reproduces the problem:
```python
import tensorflow as tf
import tempfile
from IPython import embed

sequences = [[1, 2, 3], [4, 5, 1], [1, 2]]
label_sequences = [[0, 1, 0], [1, 0, 0], [1, 1]]


def make_example(sequence, labels):
    ex = tf.train.SequenceExample()
    sequence_length = len(sequence)
    ex.context.feature["length"].int64_list.value.append(sequence_length)
    fl_tokens = ex.feature_lists.feature_list["tokens"]
    fl_labels = ex.feature_lists.feature_list["labels"]
    for token, label in zip(sequence, labels):
        fl_tokens.feature.add().int64_list.value.append(token)
        fl_labels.feature.add().int64_list.value.append(label)
    return ex


writer = tf.python_io.TFRecordWriter('./test.tfrecords')
for sequence, label_sequence in zip(sequences, label_sequences):
    ex = make_example(sequence, label_sequence)
    writer.write(ex.SerializeToString())
writer.close()

tf.reset_default_graph()

file_name_queue = tf.train.string_input_producer(['./test.tfrecords'], num_epochs=None)
reader = tf.TFRecordReader()

context_features = {
    "length": tf.FixedLenFeature([], dtype=tf.int64)
}
sequence_features = {
    "tokens": tf.FixedLenSequenceFeature([], dtype=tf.int64),
    "labels": tf.FixedLenSequenceFeature([], dtype=tf.int64)
}

ex = reader.read(file_name_queue)
```
Associated stack trace:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/common_shapes.py", line 594, in call_cpp_shape_fn
    status)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors.py", line 463, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors.InvalidArgumentError: Shape must be rank 0 but is rank 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "my_test.py", line 51, in <module>
    sequence_features=sequence_features
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/parsing_ops.py", line 640, in parse_single_sequence_example
    feature_list_dense_defaults, example_name, name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/parsing_ops.py", line 837, in _parse_single_sequence_example_raw
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_parsing_ops.py", line 285, in _parse_single_sequence_example
    name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2382, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1783, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/common_shapes.py", line 596, in call_cpp_shape_fn
    raise ValueError(err.message)
ValueError: Shape must be rank 0 but is rank 1
```
I posted this as a potential issue on GitHub, although it seems I may simply be using the API incorrectly: Tensorflow Github Issue

So, given that background, I am wondering whether I have actually found a bug here or whether the mistake is mine. Any pointer in the right direction would be greatly appreciated. I have been stuck on this for a few days and have not made any progress on my own. Thanks everyone!
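A note on reading the error message: `tf.TFRecordReader.read` returns a `(key, value)` pair, while `tf.parse_single_sequence_example` expects a single scalar string for its `serialized` argument. Whether that mismatch is what triggers the error here is an assumption, since the parsing call itself is not part of the snippet. The plain-Python sketch below does not use TensorFlow; `rank_of` and the placeholder byte strings are hypothetical stand-ins, and it only illustrates why a two-element pair counts as rank 1 while a single string counts as rank 0.

```python
def rank_of(value):
    """Return the nesting depth of lists/tuples, a rough analogue of tensor rank.

    Hypothetical helper for illustration only; it is not a TensorFlow function.
    """
    rank = 0
    while isinstance(value, (list, tuple)):
        rank += 1
        if not value:
            # An empty sequence still contributes one dimension.
            break
        value = value[0]
    return rank


# Placeholder stand-ins for what reader.read(...) hands back:
# a record key and the serialized record bytes (both made up here).
key = b"placeholder-key"
serialized_value = b"placeholder-serialized-bytes"
read_result = (key, serialized_value)

# A single string is a scalar (rank 0); the whole pair is a length-2 vector (rank 1).
scalar_rank = rank_of(serialized_value)
pair_rank = rank_of(read_result)
print("rank of serialized value:", scalar_rank)
print("rank of (key, value) pair:", pair_rank)
```

If that reading applies to this script, unpacking the pair (for example `key, serialized = reader.read(file_name_queue)`) and passing only `serialized` to the parser would be the first thing to try.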