I'm trying to improve my chat application:
Using previous (pre-processed) chat interactions from my domain, I built a tool that suggests five possible utterances to the user for a given chat context, for example:
Raw : "Hi John."
Context : Hello [[USER_NAME]]
Expressions : [Hello, Hello, How are you, Hello, Hello, again]
Of course, the results are not always relevant, for example:
Raw : "Hi John. How are you? I'm fine, are you in the office?"
Context : hello [[USER_NAME]], how are you, I'm fine, you are in the office Speeches : [Yes, No, Hello , Yes, I, How are you )
I am using Elasticsearch with a TF/IDF similarity model and an index structured as follows:
{ "_index": "engagements", "_type": "context", "_id": "48", "_score": 1, "_source": { "context": "hi [[USER_NAME]] how are you i am fine are you in the office", "utterance": "Yes I am" } }
Problem: I know for a fact that for the context "hello [[USER_NAME]], how are you, I'm fine, you are in the office", the utterances "Yes I", "Yes", and "No" are also relevant, because they appeared in similar contexts.
I'm trying to use this great video as a starting point.
Q: How can I measure precision and recall if all I know (from my source data) is just one true utterance per context?
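To make the question concrete: with a single ground-truth utterance per context, recall@k reduces to a hit rate (is the true utterance among the top k suggestions?) and MRR to the reciprocal rank of that one utterance. This is what I could compute today (function names are my own):

    def hit_at_k(true_utterance, suggestions, k=5):
        """Recall@k with a single relevant item: 1 if the true
        utterance appears among the top-k suggestions, else 0."""
        return int(true_utterance in suggestions[:k])

    def reciprocal_rank(true_utterance, suggestions):
        """1/rank of the true utterance in the suggestions, 0 if absent."""
        for rank, s in enumerate(suggestions, start=1):
            if s == true_utterance:
                return 1.0 / rank
        return 0.0

Averaged over a held-out set of (context, true_utterance) pairs these give recall@k and MRR, but they undercount: suggestions like "Yes" or "No" above would be scored as misses even though they are relevant.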