Std :: set with user-defined type, how to ensure no duplicates

So, I have std :: set, which should support a certain order, and also avoid duplicates of a user-defined type (for me). Now I can make the order work correctly by overloading the '<' statement in my type. However, the kit does not find proper duplicate detection, and to be honest, I'm not quite sure how this is done internally. I overloaded the operator '==', but for some reason I'm not sure if this is what the set actually uses? So the question is, how does a set identify duplicates when adding values? Here is the relevant code:

Custom type:

//! An element used in the route calculation. struct RouteElem { int shortestToHere; // Shortest distance from the start. int heuristic; // The heuristic estimate to the goal. Coordinate position; bool operator<( const RouteElem& other ) const { return (heuristic+shortestToHere) < (other.heuristic+other.shortestToHere); } bool operator==( const RouteElem& other ) const { return (position.x == other.position.x && position.y == other.position.y); } }; 

Thus, elements are equivalent when their position is equivalent, and an element is smaller than another if its combined functionality is less than that of another. Sorting works, but the set will take two elements of one position.

+47
c ++ set
Jul 11 '09 at 10:50
source share
6 answers

operator== not used std::set . Elements a and b are considered equal if !(a < b) && !(b < a)

+85
Jul 11 '09 at 22:53
source share
— -

std::set supports specifying a comparison function. By default, less used, which will use operator < to check for equality. You can define a custom function to check for equality and use it instead:

 std::set<RouteElem, mycomparefunction> myset; 

Note that it is not possible to separate the comparison function from the sort function. std::set is a binary tree, and if the element in the binary tree is not more or less than a certain element, it must be in one place. It does something similar in the location search algorithm:

 if (a < b) { // check the left subtree } else if (b < a) { // check the right subtree } else { // the element should be placed here. } 
+28
Jul 11 '09 at 22:56
source share
Comparator

rlbond does not prevent the insertion of elements that compare the same way. This seems to be hard to prove in the comments, given the character limit, as rlbond seems to believe that std :: set ensures that it will never contain two elements with !compare(a,b) && !compare(b,a) for its comparator. However, the rlbond comparator does not determine the strict order and, therefore, is not a valid parameter for std :: set.

 #include <set> #include <iostream> #include <iterator> #include <algorithm> struct BrokenOrder { int order; int equality; public: BrokenOrder(int o, int e) : order(o), equality(e) {} bool operator<(const BrokenOrder &rhs) const { return order < rhs.order; } bool operator==(const BrokenOrder &rhs) const { return equality == rhs.equality; } }; std::ostream &operator<<(std::ostream &stream, const BrokenOrder &b) { return stream << b.equality; } // rlbond magic comparator struct LessThan : public std::binary_function<BrokenOrder, BrokenOrder, bool> { bool operator()(const BrokenOrder& lhs, const BrokenOrder& rhs) const { return !(lhs == rhs) && (lhs < rhs); } }; int main() { std::set<BrokenOrder,LessThan> s; for (int i = 0; i < 5; ++i) { s.insert(BrokenOrder(i,i)); } for (int i = 0; i < 5; ++i) { s.insert(BrokenOrder(10-i,i)); } std::copy(s.begin(), s.end(), std::ostream_iterator<BrokenOrder>(std::cout, "\n")); } 

Output:

 0 1 2 3 4 3 2 1 0 

Duplicates The magic comparator failed. Different elements in the set have the same value of equality and, therefore, compare them with operator== , because when inserting the set never compared the new element with its duplicate. The only duplicate that was excluded was 4 because the two 4 had sort orders 4 and 6. This put them close enough to each other in the set to compare with each other.

From the C ++ standard: 25.3: 3 "For the algorithms to work correctly, comp must cause strict weak ordering by values."

25.3: 4 "... the requirements are that comp and equiv are both transitive relationships:

 comp(a,b) && comp(b,c) implies comp(a,c)" 

Now consider the elements a = BrokenOrder(1,1) , b = BrokenOrder(2,2) and c = BrokenOrder(9,1) and comp , of course, equal to the magic comparator. Then:

  • comp(a,b) true, since 1! = 2 (equality) and 1 <2 (order)
  • comp(b,c) true since 2! = 1 (equality) and 2 <9 (order)
  • comp(a,c) is false since 1 == 1 (equality)
+6
Jul 14 '09 at 2:08
source share

The STL implementation does something conceptually in such a way as to detect equality:

 bool equal = !(a < b) && !(b < a); 

That is, if two elements are no less than the others, then they must be equal. You can verify this by setting a breakpoint on your operator==() method and checking that it is called at all.

I would generally be suspicious of comparison operators that tested completely different things. Your operator < defined in terms of two things that are independent of how your operator == defined. Generally, you want these comparisons to use consistent information.

+4
Jul 11 '09 at 22:54
source share

You can try something like the following:

 //! An element used in the route calculation. struct RouteElem { int shortestToHere; // Shortest distance from the start. int heuristic; // The heuristic estimate to the goal. Coordinate position; bool operator<( const RouteElem& other ) const { return (heuristic+shortestToHere) < (other.heuristic+other.shortestToHere); } bool operator==( const RouteElem& other ) const { return (position.x == other.position.x && position.y == other.position.y); } }; struct CompareByPosition { bool operator()(const RouteElem &lhs, const RouteElem &rhs) { if (lhs.position.x != rhs.position.x) return lhs.position.x < rhs.position.x; return lhs.position.y < rhs.position.y; } }; // first, use std::set to remove duplicates std::set<RouteElem,CompareByPosition> routeset; // ... add each RouteElem to the set ... // now copy the RouteElems into a vector std::vector<RouteElem> routevec(routeset.begin(), routeset.end()); // now sort via operator< std::sort(routevec.begin(), routevec.end()); 

Obviously there is a copy in the middle that looks slow. But any structure that indexes elements according to two different criteria, therefore, will have some additional overhead for each element compared to a set. All of the above code is O (n log n), assuming your implementation of std :: sort uses introsort.

If you have this, in this scheme you can use unordered_set instead of set to complete the initial definition. Since the hash will have to depend only on x and y, it should be faster than the O (log N) comparisons needed to be inserted into the set.

Edit: just noticed that you said you wanted to “keep” the sort order, and not that you wanted to process everything in the package. Sorry about that. If you want to effectively maintain order and avoid duplicates when adding elements, I would recommend using the set or unordered set defined above, depending on the position, as well as std::multiset<RouteElem> , which will maintain the order of operator< . For each new item, do:

 if (routeset.insert(elem).second) { routemultiset.insert(elem); } 

Remember that this does not guarantee any exceptions. If the second insertion is thrown, then the route has been changed, so the state is no longer compatible. So I think you really need to:

 if (routeset.insert(elem).second) { try { routemultiset.insert(elem); // I assume strong exception guarantee } catch(...) { routeset.erase(elem); // I assume nothrow. Maybe should check those. throw; } } 

Or the equivalent with RAII, which will be more detailed if your code has only one place you've ever used the RAII class, but it would be better if there were a lot of repetitions.

+3
Jul 11 '09 at 23:27
source share

Beware of the consequences of this. It looks like you are trying to do something like A *, and if you try to insert a "duplicate", it will be ignored, even if there is a "better" route.

NOTE. This solution does not work, see one explanation below

 struct RouteElem { int shortestToHere; // Shortest distance from the start. int heuristic; // The heuristic estimate to the goal. Coordinate position; bool operator<( const RouteElem& other ) const { return (heuristic+shortestToHere) < (other.heuristic+other.shortestToHere); } bool operator==( const RouteElem& other ) const { return (position.x == other.position.x && position.y == other.position.y); } }; struct RouteElemLessThan : public std::binary_function<RouteElem, RouteElem, bool> { bool operator()(const RouteElem& lhs, const RouteElem& rhs) const { return !(lhs == rhs) && (lhs < rhs); } }; std::set<RouteElem, RouteElemLessThan> my_set; 
0
Jul 11 '09 at 23:45
source share



All Articles