Keras: How to use layer weights as a loss function?

I am implementing a custom loss function in Keras. The model is an autoencoder. The first layer is an Embedding layer, which embeds an input of shape (batch_size, sentence_length) into (batch_size, sentence_length, embedding_dimension). The model then compresses the embedding into a vector of some fixed dimension, and finally should restore the embedding (batch_size, sentence_length, embedding_dimension).
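For context, here is a minimal sketch of the kind of model I mean (the encoder/decoder internals, layer sizes, and names are placeholders; only the input and output shapes match the description above):

from keras.layers import Input, Embedding, LSTM, RepeatVector, TimeDistributed, Dense
from keras.models import Model

vocab_size = 50           # toy vocabulary size
sentence_length = 2       # e.g. "the cat"
embedding_dimension = 10
latent_dim = 8            # placeholder size of the compressed vector

input_sequence = Input(shape=(sentence_length,), dtype='int32')
# (batch_size, sentence_length) -> (batch_size, sentence_length, embedding_dimension)
X = Embedding(vocab_size, embedding_dimension, trainable=True)(input_sequence)
# compress the embedded sentence into a single vector ...
encoded = LSTM(latent_dim)(X)
# ... and expand it back to (batch_size, sentence_length, embedding_dimension)
decoded = LSTM(latent_dim, return_sequences=True)(RepeatVector(sentence_length)(encoded))
X_hat = TimeDistributed(Dense(embedding_dimension))(decoded)

model = Model(input_sequence, X_hat)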

However, the Embedding layer is trainable, and the loss must use the weights of the Embedding layer (I have to sum over all the word embeddings of my vocabulary).

For example, suppose I want to train on the toy sentence "the cat": sentence_length is 2, embedding_dimension is 10, and the vocabulary size is 50, so the embedding matrix has shape (50, 10). The output X of the Embedding layer has shape (1, 2, 10). It then passes through the model, and the output X_hat also has shape (1, 2, 10). The model should be trained to maximize the likelihood that the vector X_hat[0] representing "the" is most similar to the vector X[0] representing "the" in the Embedding layer, and the same for "cat". But the loss is such that I have to compute the cosine similarity between X and X_hat, normalized by the sum of the cosine similarities between X_hat and each of the embeddings (50, since the vocabulary size is 50) in the embedding matrix, i.e. the rows of the Embedding layer's weight matrix.
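To make the loss concrete, here is a rough numpy sketch of that computation on the toy shapes (the variable names and the final reduction are mine, just to illustrate the normalization):

import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

embedding_matrix = np.random.randn(50, 10)   # (vocab_size, embedding_dimension)
X = embedding_matrix[[0, 7]]                 # embeddings of "the" and "cat", shape (2, 10)
X_hat = X + 0.1 * np.random.randn(2, 10)     # the model's reconstruction, shape (2, 10)

scores = []
for t in range(X.shape[0]):
    # similarity to the true embedding ...
    numerator = cosine(X_hat[t], X[t])
    # ... normalized by the similarity of X_hat[t] to every vocabulary embedding
    denominator = sum(cosine(X_hat[t], embedding_matrix[v])
                      for v in range(embedding_matrix.shape[0]))
    scores.append(numerator / denominator)

# training should push these normalized similarities up, e.g. by minimizing their negative mean
loss = -np.mean(scores)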

But how can I access the weights of the Embedding layer at each iteration of the training process?

Thanks!

1 answer

It seems a little crazy, but it appears to work: instead of creating a custom loss function that I would pass to model.compile, I make the network compute the loss (equation 1 from arxiv.org/pdf/1708.04729.pdf) in a function that I call with a Lambda layer:

loss = Lambda(lambda x: similarity(x[0], x[1], x[2]))([X_hat, X, embedding_matrix])    
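Here similarity is my own function; written with backend ops it looks roughly like this (a sketch, not the exact code, assuming embedding_matrix is passed in as a (vocab_size, embedding_dimension) tensor):

from keras import backend as K

def similarity(X_hat, X, embedding_matrix):
    # L2-normalize so that dot products become cosine similarities
    X_hat_n = K.l2_normalize(X_hat, axis=-1)          # (batch, length, dim)
    X_n = K.l2_normalize(X, axis=-1)                  # (batch, length, dim)
    E_n = K.l2_normalize(embedding_matrix, axis=-1)   # (vocab, dim)

    # cosine similarity between each reconstructed vector and its target
    num = K.sum(X_hat_n * X_n, axis=-1)                       # (batch, length)
    # cosine similarity between each reconstructed vector and every vocabulary embedding
    denom = K.sum(K.dot(X_hat_n, K.transpose(E_n)), axis=-1)  # (batch, length)

    # one value per sample that shrinks as the normalized similarity grows,
    # so MSE against a zero target (see below) pushes the similarity up
    return K.mean(1.0 - num / (denom + K.epsilon()), axis=-1, keepdims=True)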

And the network has two outputs, X_hat and loss, but I weight X_hat to have weight 0 and loss to have all the weight:

model = Model(input_sequence, [X_hat, loss])
model.compile(loss='mean_squared_error',
              optimizer=optimizer,
              # only the second output (the Lambda-computed loss) contributes to training
              loss_weights=[0., 1.])

When I train the model:

for i in range(epochs):
    for j in range(num_data):
        # look up the current embedding of the j-th input sequence from the Embedding layer's weights
        input_embedding = model.layers[1].get_weights()[0][data[j:j+1]]
        # targets: the true embedding for the X_hat output, 0 for the loss output
        y = [input_embedding, 0]
        model.fit(data[j:j+1], y, batch_size=1, ...)
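Accessing the layer by index (model.layers[1]) is a bit fragile; if the Embedding layer is given a name when it is created, the same lookup can be done by name (the name here is just a placeholder):

# assumes the layer was created as Embedding(..., name='word_embedding')
embedding_weights = model.get_layer('word_embedding').get_weights()[0]  # (vocab_size, embedding_dimension)
input_embedding = embedding_weights[data[j:j+1]]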

Thus, the model learns to drive loss towards 0, and when I want to use the trained model's prediction, I use the first output, which is the reconstruction X_hat.


Source: https://habr.com/ru/post/1689405/

