Understanding Character Level Implementation in Keras LSTM

I am new to implementing language models with Keras RNNs. I have a set of discrete words (not taken from a single paragraph) with the following statistics:

  • Total word samples: 1953
  • Total number of distinct characters: 33 (including START, END and *)
  • Maximum word length (number of characters): 10

Now I want to build a model that takes a character and predicts the next character in the word. I have padded all the words to the same length, so my input Word_input has shape 1953 x 9 and the target has shape 1953 x 9 x 33. I also want to use an Embedding layer. My network architecture is:

    self.wordmodel=Sequential()
    self.wordmodel.add(Embedding(33,embedding_size,input_length=9))
    self.wordmodel.add(LSTM(128, return_sequences=True))
    self.wordmodel.add(TimeDistributed(Dense(33)))
    self.wordmodel.compile(loss='mse',optimizer='rmsprop',metrics=['accuracy'])

As an example, the word "CAT" with padding is represented as

Input to the network -- START C A T END * * * * (9 characters)

Target for the same  --- C A T END * * * * * (9 characters)

So with the TimeDistributed output layer I am measuring the network's error against this target. I have also set batch_size to 1, so that the network resets its state after predicting each word.
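
For concreteness, here is a minimal sketch of how such a padded input/target pair could be built. The char_to_idx mapping and the encode_word helper are hypothetical names introduced only for illustration, and only the characters needed for "CAT" are listed, while the real vocabulary has 33 entries.

    import numpy as np

    # Hypothetical character-to-index mapping; the real dataset has 33
    # distinct characters including the START, END and padding (*) tokens.
    char_to_idx = {'START': 0, 'END': 1, '*': 2, 'C': 3, 'A': 4, 'T': 5}
    vocab_size = 33   # total number of distinct characters
    seq_len = 9       # a max word length of 10 gives 9 input/target steps

    def encode_word(word):
        """Return (input, target): the input is the first 9 indices of
        START + word + END + padding; the target is the same sequence
        shifted left by one step and one-hot encoded."""
        chars = ['START'] + list(word) + ['END']
        chars += ['*'] * (seq_len + 1 - len(chars))       # pad to length 10
        idx = [char_to_idx[c] for c in chars]
        x = np.array(idx[:-1])                            # shape (9,)
        y = np.zeros((seq_len, vocab_size))
        y[np.arange(seq_len), idx[1:]] = 1.0              # shape (9, 33)
        return x, y

    x_cat, y_cat = encode_word('CAT')
    print(x_cat.shape, y_cat.shape)   # -> (9,) (9, 33)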

My question: am I doing this conceptually right? Whenever I run training, the accuracy gets stuck at around 56%.

Any guidance would be appreciated.

Answer:

As far as I can tell, the structure is basically right and may work to some degree, but I have a couple of suggestions:

  • In the TimeDistributed layer you should add a softmax activation, which is the standard choice for multi-class classification. As it stands, the output is unbounded, which is not intuitive if you want to predict the next character.

  • With a softmax output you should use cross-entropy as the loss function, since it measures the difference between two distributions; mse is not appropriate here. A revised compile step is sketched right after this list.
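
Putting both suggestions together, a sketch of the revised model could look like this. The layer sizes are taken from the question; embedding_size and the standalone keras import paths are assumptions.

    from keras.models import Sequential
    from keras.layers import Embedding, LSTM, TimeDistributed, Dense

    embedding_size = 64  # assumed value; keep whatever you used before

    wordmodel = Sequential()
    wordmodel.add(Embedding(33, embedding_size, input_length=9))
    wordmodel.add(LSTM(128, return_sequences=True))
    # softmax turns the 33 scores at each time step into a probability
    # distribution over the next character
    wordmodel.add(TimeDistributed(Dense(33, activation='softmax')))
    # categorical cross-entropy compares that distribution with the
    # one-hot target; 'mse' is not a natural fit for classification
    wordmodel.compile(loss='categorical_crossentropy',
                      optimizer='rmsprop',
                      metrics=['accuracy'])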

For a fuller reference, take a look at a basic char-RNN model. One well-known implementation is written in PyTorch, but the same ideas carry over to Keras and its recurrent layers.

Source: https://habr.com/ru/post/1679432/
