Textsum beam search decoder gives all <UNK> results

I tested textsum with models trained on both the toy binary data that ships with the code and the Gigaword dataset. With both datasets and their corresponding models, the beam search decoder returns nothing but "<UNK>" tokens. I used the default settings.

First, I changed the data interface in data.py and batch_reader.py to read and parse the article and abstract fields from the Gigaword dataset. I trained a model for more than 90K mini-batch steps on roughly 1.7 million documents. Then I ran the decoder on a separate test set, but it returned all <UNK> results. (Screenshot: decoder output from the model trained on Gigaword.)
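For reference, here is a minimal sketch of the kind of reader change described above. The real data.py in textsum reads serialized tf.Example records; the tab-separated format and the function name `read_gigaword_pairs` here are assumptions for illustration only.

```python
# Hypothetical sketch of a Gigaword-style reader. The actual textsum
# data pipeline (data.py / batch_reader.py) uses tf.Example protos;
# this assumes a simpler tab-separated "article<TAB>abstract" format.
def read_gigaword_pairs(lines):
    """Yield (article, abstract) pairs from tab-separated lines."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # skip malformed lines rather than crashing
        article, abstract = parts
        yield article, abstract


pairs = list(read_gigaword_pairs(["some article text\ta short abstract"]))
```

A reader like this feeds batch_reader.py, which then tokenizes and pads each pair into training batches.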

Then I used the binary toy data that comes with the code to train a small model for fewer than 1,000 mini-batch steps, and decoded the same binary data. Apart from a few "for" and "." tokens, the decode file contains nothing but <UNK>. (Screenshot: decoder output from the model trained on the binary data.) I also checked the training loss in TensorBoard, and it shows that training is converging.

I did not change any default settings during training or decoding. Has anyone tried the same thing and run into the same problem?

1 answer

[The answer text was garbled in the source; only scattered punctuation and references to [UNK] survive.]


Source: https://habr.com/ru/post/1654738/
