Hi everyone, I am trying to implement a sequence-to-sequence model using the new seq2seq module that is under development and was released with TF 1.0 and 1.1. There is a dynamic_decode function that returns the logits in the form of rnn_output. I then need to calculate a loss on that rnn output. When I run it naively, just calling tf.contrib.seq2seq.loss.sequence_loss with the rnn_output as logits (plus targets and weights), it fails:
InvalidArgumentError (see above for traceback): Incompatible shapes: [1856,1,1024] vs. [9600,1,1024]
[[Node: optimize/gradients/loss/sequence_loss/sampled_softmax_loss/Mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](optimize/gradients/loss/sequence_loss/sampled_softmax_loss/Mul_grad/Shape/_3099, optimize/gradients/loss/sequence_loss/sampled_softmax_loss/Mul_grad/Shape_1/_3101)]]
[[Node: optimize/gradients/Add/_824 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:3", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_2787_optimize/gradients/Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:3"](^_cloopMainDynamicDecoderWithAttention/decoder/decoder/while/BasicDecoderStep/multi_rnn_cell/cell_1/multi_rnn_cell/cell_2/lstm_cell/zeros/_128)]]
This is natural, since rnn_output has a dynamic shape. I see two possible solutions:

1. Pad the dynamic tensor into a fixed-size tensor whose time dimension equals the maximum allowed length. I don't know how to pad a dynamic tensor to a fixed size, but it probably involves the new interfaces for dynamic shapes: tf.while_loop and TensorArrays. It would be great to hear some tips on this; a rough sketch of what I have in mind follows this list.
2. Calculate sequence_loss dynamically. But my knowledge of TensorFlow internals is too limited to judge whether this is easy to do. Any suggestions here?
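For option 1, this is roughly what I have in mind (an untested sketch: rnn_output stands for the [batch, time, vocab] logits returned by dynamic_decode, and max_target_len / vocab_size are placeholder names for values from my config, not names from the code below):

# Hypothetical sketch: pad the dynamically shaped logits along the time
# axis up to a fixed maximum length. max_target_len and vocab_size are
# assumed to be known Python ints.
decoded_len = tf.shape(rnn_output)[1]        # actual decoded length
pad_len = max_target_len - decoded_len       # padding needed on the time axis
padded_logits = tf.pad(rnn_output, [[0, 0], [0, pad_len], [0, 0]])
# Restore a static shape so downstream ops see a fixed time dimension.
padded_logits.set_shape([None, max_target_len, vocab_size])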
General question
What is the correct approach to calculating the cross-entropy loss (with a sampled or full softmax) from the dynamically shaped rnn_output of dynamic_decode?
I have the following code:
decoder_outputs, decoder_state = seq2seq.dynamic_decode(
    my_decoder, output_time_major=False, parallel_iterations=512,
    swap_memory=True)
# rnn_output has a dynamic time dimension: [batch, decoded_len, vocab]
self.logits = decoder_outputs.rnn_output
# stack/transpose the time-major target lists into batch-major [batch, max_len]
self.loss = loss.sequence_loss(
    self.logits,
    tf.transpose(tf.stack(targets), [1, 0], name="targets_"),
    tf.transpose(tf.stack(self.target_weights), [1, 0], name="weights_"),
    softmax_loss_function=softmax_loss_function)
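For completeness, the alternative direction (option 2) that I am considering would be to slice the targets and weights down to the dynamic decode length instead of padding the logits; again, only an untested sketch built on the names above:

# Hypothetical alternative: trim the batch-major targets/weights to the
# dynamic time length of the logits rather than padding the logits.
decoded_len = tf.shape(self.logits)[1]
targets_bm = tf.transpose(tf.stack(targets), [1, 0])[:, :decoded_len]
weights_bm = tf.transpose(tf.stack(self.target_weights), [1, 0])[:, :decoded_len]
self.loss = loss.sequence_loss(self.logits, targets_bm, weights_bm,
                               softmax_loss_function=softmax_loss_function)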
ipdb> tf.__version__
'1.1.0-rc0'
python: 2.7