How do I write a function to compare and rank many sets of boolean (true/false) answers?

I have embarked on a project that is much more complex than I imagined. I am trying to design a system based on boolean (true/false) questions and answers. Users can answer any of the questions from a large set of true/false questions and are then presented with a list of the most similar users, in order of similarity, based on their answers.

I have searched far and wide but still haven't come up with much, so I was hoping someone could point me in the right direction. I'd like to know:

What is the best data structure and method for storing this kind of data? I initially assumed I could create two tables, “questions” and “answers”, in an SQL database. However, I am now wondering whether it would be easier to compare two sets of answers if each were stored as a numeric string, that is, 0 = not answered, 1 = true, 2 = false. When comparing the strings, one could assign weights of “not answered” = 0, “same answer” = 1, “opposite answer” = -1, producing a similarity score.
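The string encoding and weights described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `similarity_score` is hypothetical, and both strings are assumed to list the same questions in the same order.

```python
def similarity_score(answers_a: str, answers_b: str) -> int:
    """Score two answer strings encoded as 0 = not answered, 1 = true, 2 = false."""
    if len(answers_a) != len(answers_b):
        raise ValueError("answer strings must cover the same questions")
    score = 0
    for a, b in zip(answers_a, answers_b):
        if a == "0" or b == "0":
            continue  # at least one user did not answer: contributes 0
        score += 1 if a == b else -1  # same answer: +1, opposite answer: -1
    return score


print(similarity_score("12012", "12210"))
```

A fixed-position string is compact, but every string has to be extended whenever a question is added, which is one reason to weigh it against the two-table layout.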

How do I compare two sets of answers? To measure the “similarity” between these answer sets, I need to write a comparison function. Does anyone know which kind of comparison is best for this problem? I have looked at sequence alignment, and it may be the right approach, but I am not sure: it requires the data to be in one long sequence, and the questions are unrelated, so they do not form a natural sequence.
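One possible approach that avoids treating the answers as a sequence is to key each answer by its question ID and compare only the questions both users answered. The sketch below assumes the answers fit in memory as one dictionary per user; the names `score` and `rank_similar_users` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Answers = Dict[int, bool]  # question_id -> answer; unanswered questions are absent


def score(a: Answers, b: Answers) -> int:
    """+1 per shared question answered the same, -1 per shared question answered differently."""
    shared = a.keys() & b.keys()
    return sum(1 if a[q] == b[q] else -1 for q in shared)


def rank_similar_users(user: str, all_answers: Dict[str, Answers]) -> List[Tuple[str, int]]:
    """Return (other_user, score) pairs, most similar first."""
    mine = all_answers[user]
    scored = [(other, score(mine, theirs))
              for other, theirs in all_answers.items() if other != user]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


all_answers = {
    "alice": {1: True, 2: False, 3: True},
    "bob": {1: True, 2: False, 4: True},
    "carol": {1: False, 2: True, 3: False},
}
print(rank_similar_users("alice", all_answers))
```

Because the comparison is per question ID, the order in which questions were asked or answered does not matter.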

? , , , , , . cluster analysis, , , - , ?


+3
4

SQL ", ", , SQL . TOP, .


SELECT
    U2.userid,
    SUM(CASE
            WHEN A1.answer = A2.answer THEN 1
            WHEN A1.answer <> A2.answer THEN -1
            WHEN A1.answer IS NULL OR A2.answer IS NULL THEN 0  -- A bit redundant, but I like to make it clear
            ELSE 0
        END) AS similarity_score
FROM
    Questions Q
LEFT OUTER JOIN Answers A1 ON
    A1.question_id = Q.question_id AND
    A1.userid = @userid
LEFT OUTER JOIN Answers A2 ON
    A2.question_id = A1.question_id AND
    A2.userid <> A1.userid
LEFT OUTER JOIN Users U2 ON
    U2.userid = A2.userid
WHERE
    U2.userid IS NOT NULL  -- skip rows where no other user answered the question
GROUP BY
    U2.userid
ORDER BY
    similarity_score DESC
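
To try the idea behind the query above without a database server, a simplified variant can be run against an in-memory SQLite database from Python. This is a sketch under assumptions, not the answerer's exact query: it uses inner joins instead of outer joins, replaces `@userid` with the SQLite named parameter `:userid`, and the table layout, sample rows and the helper name `similar_users` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Questions (question_id INTEGER PRIMARY KEY);
    CREATE TABLE Answers (userid INTEGER, question_id INTEGER, answer INTEGER);
    INSERT INTO Questions VALUES (1), (2), (3);
    -- user 2 agrees with user 1 on both shared questions; user 3 disagrees on all three
    INSERT INTO Answers VALUES
        (1, 1, 1), (1, 2, 0), (1, 3, 1),
        (2, 1, 1), (2, 2, 0),
        (3, 1, 0), (3, 2, 1), (3, 3, 0);
""")

# Inner joins keep only questions that both users answered,
# so unanswered questions contribute nothing to the score.
QUERY = """
    SELECT A2.userid,
           SUM(CASE WHEN A1.answer = A2.answer THEN 1 ELSE -1 END) AS similarity_score
    FROM Answers A1
    JOIN Answers A2
      ON A2.question_id = A1.question_id
     AND A2.userid <> A1.userid
    WHERE A1.userid = :userid
    GROUP BY A2.userid
    ORDER BY similarity_score DESC
"""


def similar_users(conn: sqlite3.Connection, userid: int):
    """Return (userid, similarity_score) rows, most similar first."""
    return conn.execute(QUERY, {"userid": userid}).fetchall()


print(similar_users(conn, 1))
```
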
+1

. , - ( ). , , ( SQL-), . ( 0-2), , - . , .

: , SimilarQuestionAnswers, UserAnswers, SimilarQuestionAnswers. , , . , . ( a , 20, b , 10). , .

, , , - . , - , 1, , , , .

: . SQL-, , . , SQL , , . , .


, . , , . , +, -, *,/ Math.Whatever() .


+1

, , , . , 1000 A B, A (2Y, 998N) B (500Y, 500N), "Ys A" , Y B. B Ns A.

+1

(, OkCupid). .

+1

Source: https://habr.com/ru/post/1756228/

