An efficient algorithm for finding the values that occur least often in a list

I am creating a quiz application that randomly pulls questions from a question pool. However, there is a requirement that the pool be limited to questions the user has not yet seen. If the user has seen all the questions, the algorithm should "reset" and show only questions the user has seen once. That is, always show the user questions they have never seen or, if they have seen them all, show questions they have seen less often before questions they have seen more often.

The list (L) is built so that the following holds: any value (I) in the list may appear once or be repeated several times. Let J be any other value in the list, different from I. Then 0 <= abs(frequency(I) - frequency(J)) <= 1 is always true.

In other words: if a value is repeated 5 times in the list, and 5 is the maximum number of times any value is repeated, then every value in the list is repeated either 4 or 5 times. The algorithm must return all the values with frequency == 4 before it returns any with frequency == 5.
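
To make the requirement concrete, here is a minimal sketch of the two checks it describes: testing that a list satisfies the frequency invariant, and finding the values with the lowest frequency. The function names `least_seen` and `satisfies_invariant` are hypothetical and used for illustration only.

```python
from collections import Counter

def least_seen(values):
    """Return the distinct values that occur least often in `values`."""
    counts = Counter(values)
    if not counts:
        return []
    lowest = min(counts.values())
    # Sorted only so that the result is deterministic
    return sorted(v for v, c in counts.items() if c == lowest)

def satisfies_invariant(values):
    """True if no two values differ in frequency by more than 1."""
    counts = list(Counter(values).values())
    return not counts or max(counts) - min(counts) <= 1

history = [1, 2, 3, 1, 2, 3, 1]
print(satisfies_invariant(history))  # True
print(least_seen(history))           # [2, 3]
```

Note that this only looks at values already present in the list; a question that has never been assigned does not appear in it at all.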

Sorry this is so verbose; I'm doing my best to define the problem. Please feel free to leave comments with questions, and I will clarify further if necessary.

Thanks in advance for any help you can provide.

Clarification

Thanks for the suggested answers. I don't think any of them are quite there yet. Let me clarify further.

I am not interacting with the user and asking them questions directly. I assign question IDs to an exam record, so that when the user starts the exam, the list of questions they have access to is already determined. Therefore, I have two data structures to work with:

  • The list of possible question IDs the user has access to
  • The list of all question IDs this user has ever been assigned. This is the list L described above.

So, unless I'm mistaken, the solution to this problem will need to involve list and/or set operations on the two lists described above.


from collections import Counter
import random

# the number of question ids I need returned to
# assign to the exam
needed = 3

# the "pool" of possible question ids the user has access to
possible = [1,2,3,4,5]

# examples of lists of question ids I might see that represent
# questions a user has already answered
answered1 = []
answered2 = [1,3]
answered3 = [5,4,3,2]
answered4 = [5,4,3,2,1,1,2]
answered5 = [5,4,3,2,1,1,2,3,4,5,1]
answered6 = [5,4,3,2,1]

def getdiff(answered):
    diff = set(possible) - set(answered)
    still_needed = needed - len(diff)
    if still_needed > 0:
        not_already_selected = list(set(possible) - diff)
        random.shuffle(not_already_selected)
        diff = list(diff) + not_already_selected[0:still_needed]
        random.shuffle(diff)
        return diff
    diff = list(diff)
    random.shuffle(diff)
    if still_needed == 0:
        return diff
    return diff[0:needed]

def workit(answered):
    """ based on frequency, reduce the list down to only
        those questions we want to consider "answered"
    """
    if len(possible) > len(answered):
        return getdiff(answered)
    counted = Counter(answered)
    max_count = max(counted.values())
    # the key here is to think of "answered" questions as
    # only those that have been seen with max frequency
    new_answered = []
    # .items() works on both Python 2 and Python 3
    for value, count in counted.items():
        if count == max_count:
            new_answered.append(value)
    return getdiff(new_answered)

print(1, workit(answered1))
print(2, workit(answered2))
print(3, workit(answered3))
print(4, workit(answered4))
print(5, workit(answered5))
print(6, workit(answered6))

"""
>>> 
1 [2, 4, 3]
2 [2, 5, 4]
3 [5, 2, 1]
4 [5, 3, 4]
5 [2, 4, 3]
6 [2, 3, 5]
>>> ================================ RESTART ================================
>>> 
1 [3, 1, 4]
2 [5, 2, 4]
3 [2, 4, 1]
4 [5, 4, 3]
5 [4, 5, 3]
6 [1, 5, 3]
>>> ================================ RESTART ================================
>>> 
1 [1, 2, 3]
2 [4, 2, 5]
3 [4, 1, 5]
4 [5, 4, 3]
5 [2, 5, 4]
6 [2, 1, 4]
"""
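
The code above relies on the list satisfying the frequency invariant. As an alternative sketch (not the code above), the two lists can be combined by counting how often each ID was assigned and ranking the pool by that count, with ties broken randomly. The name `pick_questions` and its parameters are made up for this illustration.

```python
import random
from collections import Counter

def pick_questions(possible, assigned, needed, rng=random):
    """Pick up to `needed` IDs from `possible`, least-assigned first."""
    counts = Counter(assigned)  # IDs never assigned count as 0
    pool = list(possible)
    rng.shuffle(pool)           # random order among equally-seen IDs
    # list.sort is stable, so the shuffled order survives within each count
    pool.sort(key=lambda qid: counts[qid])
    return pool[:needed]

print(sorted(pick_questions([1, 2, 3, 4, 5], [1, 3], 3)))                 # [2, 4, 5]
print(sorted(pick_questions([1, 2, 3, 4, 5], [5, 4, 3, 2, 1, 1, 2], 3)))  # [3, 4, 5]
```

Because unseen IDs have a count of zero, they always sort ahead of seen ones, and no special case is needed for the "reset" once every question has been seen.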

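Treat the question pool like a deck of cards: shuffle it, deal from it, and reshuffle only when len(deck) reaches zero. After n deals, every question has been used either n or n-1 times.

Something like this: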

from random import shuffle

def deal():
    question_IDs = get_all_questions(dbconn) # all questions
    shuffle(question_IDs)
    increment_deal_count(dbconn, userID) # how often this student has gotten questions
    return question_IDs


count_deals = get_stored_deals(dbconn, userID) # specific to this user
if count_deals: 
    question_IDs = get_stored_questions(dbconn, userID) # questions stored for this user 
else: # If 0 or missing, this is the first time for this student
    question_IDs = deal()


while need_another_question(): # based on exam requirements
    try:
        question_id = question_IDs.pop() # renamed from id to avoid shadowing the builtin
    except IndexError:
        question_IDs = deal()
        question_id = question_IDs.pop() # Trouble if db is ever empty.

    use_question(question_id) # query db with the ID, then put question in print, CMS, whatever

# When we leave that while loop, we have used at least some of the questions
# question_IDs lists the *unused* ones for this deal
# and we know how many times we've dealt.

store_in_db(dbconn, userinfo, question_IDs)
# If you want to know how many times a question has been available, it's
# count_deals - (ID in question_IDs)
# because True evaluates to 1 if you try to subtract it from an integer. 
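
The code above depends on database helpers that are not shown. As a rough, self-contained sketch of the same idea, the deck can be kept in memory instead. The class `QuestionDeck` and its attribute names are hypothetical stand-ins for the stored state, not part of the code above.

```python
import random

class QuestionDeck:
    """In-memory stand-in for the database-backed deck."""

    def __init__(self, question_ids, rng=random):
        self.all_ids = list(question_ids)
        self.rng = rng
        self.remaining = []  # undealt IDs from the current shuffle
        self.deals = 0       # how many times the full deck has been shuffled

    def draw(self):
        if not self.all_ids:
            raise ValueError("no questions available")
        if not self.remaining:
            # Deck exhausted: reshuffle every question and start a new deal
            self.remaining = self.all_ids[:]
            self.rng.shuffle(self.remaining)
            self.deals += 1
        return self.remaining.pop()

    def times_used(self, qid):
        # A question still in the deck has been used one time fewer
        # than the number of deals (True counts as 1 when subtracted).
        return self.deals - (qid in self.remaining)

deck = QuestionDeck([1, 2, 3, 4, 5])
drawn = [deck.draw() for _ in range(7)]
print(sorted(drawn[:5]))                                 # [1, 2, 3, 4, 5]
print(sorted(deck.times_used(q) for q in deck.all_ids))  # [1, 1, 1, 2, 2]
```

After seven draws from a five-question deck, two questions have been used twice and three once, so the usage counts never differ by more than one.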

Another option is to shuffle the full list and walk through it, reshuffling after each complete pass:

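import random

def ask_questions(list_of_questions):
    """Cycle through the questions in random order until the user stops."""
    if not list_of_questions:
        return  # nothing to ask; also avoids looping forever on an empty list
    while True:
        # Reshuffle before every pass so each question appears once per pass
        random.shuffle(list_of_questions)
        for question in list_of_questions:
            print(question)
            # Python 3: input() replaces raw_input(); an empty reply stops
            if not input('Another question? '):
                return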

Source: https://habr.com/ru/post/1540080/

