Keras LSTM language model using embeddings

I am building a language model with Keras.

Basically, my vocabulary size N is ~30,000. I have already trained word2vec, so I use those embeddings, followed by an LSTM, and then predict the next word with a fully connected layer followed by softmax. My model is below:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Activation

EMBEDDING_DIM = 256
# `embeddings` is the pre-trained word2vec matrix of shape (N, EMBEDDING_DIM)
embedding_layer = Embedding(N, EMBEDDING_DIM, weights=[embeddings],
                            trainable=False)

model = Sequential()
model.add(embedding_layer)
model.add(LSTM(EMBEDDING_DIM))
model.add(Dense(N))
model.add(Activation('softmax'))

model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
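
For context, a minimal training call might look like the sketch below. The data here is purely hypothetical (random indices; the names `X` and `y` are mine), and it assumes the `model` and `N` defined above; it only shows the shapes that categorical_crossentropy expects.

import numpy as np
from keras.utils import to_categorical

# Hypothetical toy data: each row of X is a window of word indices,
# y holds the index of the word that follows each window.
X = np.random.randint(0, N, size=(1000, 20))
y = np.random.randint(0, N, size=(1000,))

# categorical_crossentropy expects one-hot targets of shape (samples, N)
model.fit(X, to_categorical(y, num_classes=N), batch_size=128, epochs=5)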

I have two questions:

  • In this case, can you confirm that only the last hidden state of the LSTM is used (followed by the fully connected layer and softmax), and that there is no max/mean pooling over the sequence of hidden states (as, for example, in the sentiment analysis tutorial at http://deeplearning.net/tutorial/lstm.html)?

  • Instead of having the LSTM output predict an index among the N (30,000) words, would it make sense to predict the EMBEDDING_DIM-dimensional embedding of the next word directly, with an mse loss against the "true" word2vec vector, to avoid the expensive softmax over the whole vocabulary?

Thanks!


Answer:

Yes, only the last hidden state of the LSTM is used here. To get the hidden state at every timestep, pass return_sequences=True; by default it is False.
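
For illustration, here is a sketch of the pooled variant mentioned in the first question. It is not the asker's model; it reuses `N`, `EMBEDDING_DIM`, and `embeddings` from the question and uses keras.layers.GlobalMaxPooling1D for the pooling step.

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Activation, GlobalMaxPooling1D

# With return_sequences=True the LSTM emits one hidden state per timestep,
# shape (batch, timesteps, EMBEDDING_DIM), which can then be pooled.
model_pooled = Sequential()
model_pooled.add(Embedding(N, EMBEDDING_DIM, weights=[embeddings], trainable=False))
model_pooled.add(LSTM(EMBEDDING_DIM, return_sequences=True))
model_pooled.add(GlobalMaxPooling1D())  # max over the time axis -> (batch, EMBEDDING_DIM)
model_pooled.add(Dense(N))
model_pooled.add(Activation('softmax'))
model_pooled.compile(loss="categorical_crossentropy", optimizer="rmsprop")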

As for the second question: you can regress onto the embedding vector with an mse loss, but this usually works worse, since the mse in embedding space does not give you a probability distribution over the next word. If the full softmax over 30,000 classes is too slow, look at Hierarchical Softmax.
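
If you want to try the regression variant anyway, a minimal sketch (my own, assuming the targets are rows of the `embeddings` matrix rather than one-hot vectors) could look like this:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

# Regress directly onto the EMBEDDING_DIM-dimensional word2vec vector of the
# next word; prediction then needs a nearest-neighbour lookup in `embeddings`.
model_reg = Sequential()
model_reg.add(Embedding(N, EMBEDDING_DIM, weights=[embeddings], trainable=False))
model_reg.add(LSTM(EMBEDDING_DIM))
model_reg.add(Dense(EMBEDDING_DIM))  # output lives in embedding space
model_reg.compile(loss="mse", optimizer="rmsprop")

# targets for model_reg.fit: embeddings[y] instead of to_categorical(y)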


Source: https://habr.com/ru/post/1652443/

