It may seem a little crazy, but it works: instead of writing a custom loss function to pass to model.compile, the network computes the loss itself (equation 1 from arxiv.org/pdf/1708.04729.pdf) inside a function that I call through a Lambda layer:
loss = Lambda(lambda x: similarity(x[0], x[1], x[2]))([X_hat, X, embedding_matrix])
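Here similarity must take the three tensors and return one loss value per sample (shape (batch, 1)). As a reference for the wiring only, a minimal placeholder could look like the sketch below; the body is a stand-in (negative mean cosine similarity), not the paper's equation 1, which you would substitute in:

from keras import backend as K

def similarity(x_hat, x, embedding_matrix):
    # Placeholder body, NOT equation 1 from the paper: negative mean cosine
    # similarity between the reconstruction and the input embeddings.
    # embedding_matrix is unused in this stub; it is kept only so the
    # signature matches the Lambda call above.
    x_hat_n = K.l2_normalize(x_hat, axis=-1)
    x_n = K.l2_normalize(x, axis=-1)
    cos = K.sum(x_hat_n * x_n, axis=-1)          # (batch, timesteps)
    return -K.mean(cos, axis=-1, keepdims=True)  # (batch, 1)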
The network then has two outputs, X_hat and loss, but I give X_hat a loss weight of 0 and loss a weight of 1, so that all of the weight falls on the loss output:
model = Model(input_sequence, [X_hat, loss])
model.compile(loss='mean_squared_error',
              optimizer=optimizer,
              loss_weights=[0., 1.])
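To sanity-check this two-output wiring in isolation, here is a self-contained toy version (layer sizes, names and the Lambda body are illustrative placeholders, not the actual model):

from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K

inp = Input(shape=(8,))
x_hat_toy = Dense(8)(inp)  # stand-in for the decoder's reconstruction
# per-sample scalar computed inside the graph, shape (batch, 1)
loss_toy = Lambda(lambda t: K.mean(K.square(t[0] - t[1]),
                                   axis=-1, keepdims=True))([x_hat_toy, inp])
toy = Model(inp, [x_hat_toy, loss_toy])
toy.compile(loss='mean_squared_error', optimizer='adam',
            loss_weights=[0., 1.])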
When I train the model:
for i in range(epochs):
    for j in range(num_data):
        # look up the embeddings of the j-th sequence in the embedding layer
        input_embedding = model.layers[1].get_weights()[0][data[j:j+1]]
        # dummy target for X_hat (its loss weight is 0) and a zero target
        # so that minimizing MSE drives the loss output towards 0
        y = [input_embedding, np.zeros((1, 1))]
        model.fit(data[j:j+1], y, batch_size=1, ...)
Thus, the model learns to drive loss towards 0, and when I want to use the trained model's prediction, I take the first output, which is the reconstruction X_hat.
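At inference time that just means keeping the first element of the prediction list:

# the model returns [X_hat, loss]; keep only the reconstruction
X_hat_pred, _ = model.predict(data[j:j+1], batch_size=1)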