Let the experiment, right.
1. Example
Let's look at {A,B,C,D} and sort it.
Solution 1: in sets
- Greater than
{A,B,C,D} β B (thus B>A , B>C and B>D ) - Greater than
{A,C,D} β D (thus D>A and D>C ) - Greater than
{A,C} β A (thus A>C )
General order [B,D,A,C]
Solution 2: In pairs
- More between
A and C β A (so A>C ) - More between
B and D β B (thus B>D ) - More between
D and A β D
General order [B,D,A,C]
What is the catch? Obviously, this is more complicated in pairs, here I chose them so that the merger would be simple (none).
2. Notes
a) General ordering
> works so well with full ordering: i.e. for two given elements A and B set, either A>B or B>A If none of these two relations holds, then A and B equivalent.
The problem with Solution 1's approach is that if you represent a user with {A,B,C,D} and A and B equivalent and larger than C and D ... then what should answer
b) transitivity
The relation > transitive, which means that if A>B and B>C , then A>C It is important to use this fact, since you can display the relationship between two elements without even asking the user.
c) What is the purpose?
Is the goal to minimize the number of questions for the user or to minimize the user's work? Because, obviously, it is more difficult for the user to answer questions from the first solution ...
3. Modeling
You can simulate the problem as a "graph" problem.
You start with a set of nodes {A,B,C,D} that represent the values ββyou want to check.
Sorting a set is equivalent to computing the minimum set of oriented edges that connect these nodes, so for any two nodes, the path leads from one to the other. I insist on the minimum.
For example, if I have 2 edges: B>A and B>C , then if I detect than A>C , I will remove B>C , because I can infer it by transitivity. Important properties are that if neither of the two nodes is equivalent, the cardinal of the resulting set of edges is the cardinal of the set of nodes minus 1. In my example (taking into account 4 nodes) this will be 3.
An oracle (or an extremely lucky guy) will thus be able to ask only 3 questions, but still build this schedule ... what should we strive for.
4. How to solve this problem?
Well, let us not have two equivalent elements. This means that if A>B is false, then I can conclude that B>A ...
To represent our little graph, we take an array:
ABCDD . > .
Because of symmetry, we only need a triangle.
Counting the number . , we can see the number of unknown relationships, the ideal solution is to get rid of all these . by asking as few questions as possible to the user.
So a good question is to remove as much as possible . in one fell swoop.
5. My question
And so I have another question:
Select the elements lower than "D" in the following {A,B,C}: _
This question feels better than what's bigger ... because I'm clearly focused on the relationships I want to know ( D?A , D?B and D?C ), while there are great guarantees that I will get so many relationships but I donβt know what in advance.
I try to maximize the usefulness of the question
Of course, for laziness this does not mean: it is still important to take into account the transition when using the results of questions.
6. The study of large sets
With larger sets, you can group the elements and then combine them later, but merging is always a dirty operation: you will probably get answers that you already knew. However, this is a big practice for my little question:
For two ordered (disjoint) sets: [A,B,C,D,...] and [R,S,T,U,...] we will consider 3 questions on the toolbar:
The third question requires a more complex answer, but it is much more suitable for merge situations. In the case of a merger, it is as effective as requesting a sort of elements, since it requests exactly for three relations that we do not know on the board.
7. Conclusion
You now have 4 questions:
- Sort: Sort the following items from largest to lowest {A, B, C, D}? _
- Pivot: Which elements are greater than A in this set {B, C, D}? _
- Greater: Which element is greater in this set {A, B, C, D}? _
- Couple: Which of the elements A and B is bigger? _
(A pair can be considered as a degenerate case of either the Big or the Pivot)
Given that each question gives an n-1 relationship for n nodes, you can try to determine how long it will take for the user to answer the T(n) question, and then find n that maximize T(n)/n ;)