How to measure precision and recall when full ground-truth data is missing.

I'm trying to improve my chat application:

Using previously collected (pre-processed) chat interactions from my domain, I built a tool that suggests 5 possible utterances for a given chat context, for example:

Raw : "Hi John."

Context : Hello [[USER_NAME]]
Utterances : [Hello, Hello, How are you, Hello, Hello again]


Of course, the results are not always relevant, for example:

Raw : "Hi John. How are you? I'm fine, are you in the office?"

Context : hello [[USER_NAME]], how are you, I'm fine, you are in the office

Utterances : [Yes, No, Hello, Yes I, How are you]

I am using Elasticsearch with a TF/IDF similarity model and an index structured as follows:

{
  "_index": "engagements",
  "_type": "context",
  "_id": "48",
  "_score": 1,
  "_source": {
    "context": "hi [[USER_NAME]] how are you i am fine are you in the office",
    "utterance": "Yes I am"
  }
}
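For reference, retrieval against such an index can be sketched as a plain match query on the context field. The index and field names come from the document above; the query shape, the size parameter, and the commented-out client call are my assumptions, not the author's actual code:

```python
def build_query(context_text, size=5):
    """Build an Elasticsearch query body that ranks stored contexts
    by TF/IDF similarity to the incoming chat context."""
    return {
        "size": size,  # top-5 candidate utterances, as in the tool above
        "query": {"match": {"context": context_text}},
        "_source": ["context", "utterance"],
    }

# With the official Python client this would be run roughly as:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch()
#   resp = es.search(index="engagements", body=build_query("hi john how are you"))
#   candidates = [hit["_source"]["utterance"] for hit in resp["hits"]["hits"]]
```

Each hit's `_score` is the TF/IDF relevance of the stored context, and the candidate utterances are read from `_source.utterance`.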

Problem: I know for a fact that for the context "hello [[USER_NAME]], how are you, I'm fine, you are in the office", the utterances "Yes I", "Yes" and "No" are also relevant, because they appeared in similar contexts.

I'm trying to use this great video as a starting point.

Q: How can I measure precision and recall if all I know (from my source data) is a single true utterance per context?
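To make the difficulty concrete: with exactly one known-relevant utterance per context, recall@k degenerates to hit-or-miss (1 if the true utterance appears in the top k, else 0) and precision@k can never exceed 1/k. A minimal sketch (function name and shapes are mine, not from any library):

```python
def precision_recall_at_k(ranked_utterances, true_utterance, k=5):
    """Precision@k and recall@k when exactly one utterance is known
    to be relevant for the context."""
    top_k = ranked_utterances[:k]
    hits = 1 if true_utterance in top_k else 0
    precision = hits / k   # at most 1/k with a single relevant item
    recall = float(hits)   # the one relevant item was found, or not
    return precision, recall
```

Averaged over many held-out contexts, the recall@k number is the "hit rate", which is why single-ground-truth evaluations usually report that (or a rank-sensitive variant) rather than raw precision.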


Source: https://habr.com/ru/post/1011877/

