I'm trying to improve my chat application:
Using previous (pre-processed) chat interactions from my domain, I built a tool that suggests five possible utterances to the user for a given chat context, for example:
Raw : "Hi John."
Context : Hello [[USER_NAME]]
Expressions : [Hello, Hello, How are you, Hello, Hello, again]
Of course, the results are not always relevant, for example:
Raw : "Hi John. How are you? I'm fine, are you in the office?"
Context : hello [[USER_NAME]], how are you, I'm fine, you are in the office Speeches : [Yes, No, Hello , Yes, I, How are you )
I am using Elasticsearch with a TF/IDF similarity model and an index structured as follows:
{ "_index": "engagements", "_type": "context", "_id": "48", "_score": 1, "_source": { "context": "hi [[USER_NAME]] how are you i am fine are you in the office", "utterance": "Yes I am" } }
Problem: I know for a fact that for the context "hello [[USER_NAME]], how are you, I'm fine, you are in the office", the utterances "Yes I", "Yes", and "No" are also relevant, because they appeared in similar contexts.
I'm trying to use this great video as a starting point.
Q: How can I measure precision and recall if all I know (from my source data) is just one true utterance per context?
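To make the question concrete: with a single ground-truth utterance per context, recall@k reduces to a hit rate (is the true utterance among the top k suggestions?) and MRR to the reciprocal rank of that one utterance. This is what I could compute today (function names are my own):

    def hit_at_k(true_utterance, suggestions, k=5):
        """Recall@k with a single relevant item: 1 if the true
        utterance appears among the top-k suggestions, else 0."""
        return int(true_utterance in suggestions[:k])

    def reciprocal_rank(true_utterance, suggestions):
        """1/rank of the true utterance in the suggestions, 0 if absent."""
        for rank, s in enumerate(suggestions, start=1):
            if s == true_utterance:
                return 1.0 / rank
        return 0.0

Averaged over a held-out set of (context, true_utterance) pairs these give recall@k and MRR, but they undercount: suggestions like "Yes" or "No" above would be scored as misses even though they are relevant.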