An elegant way to remove all elements of a vector that are contained in another vector?

While looking at the code, I found a cyclical and algorithmically slow implementation of std :: set_difference

for(int i = 0; i < a.size(); i++) { iter = std::find(b.begin(),b.end(),a[i]); if(iter != b.end()) { b.erase(iter); } } 

It can be easily replaced by sorting (vectors are not sorted) + set_difference, but this requires the allocation of new memory (see my recent Q. Can the given difference be displayed primarily in the input? Why can this not be done "in place").
So my solution would look something like this:

 sort(a.begin(), a.end()); for(size_t i = 0; i < b.size(); i++) { if (binary_search(a.begin(), a.end(), b[i])) { swap(b[i], b[b.size()-1]); //remove current element by swapping with last b.pop_back(); // and removing new last by shrinking } } 

can this be done more elegantly?
elegant is subjective, so in the framework of this Q is defined as a clearer code (ideally, some of the STL algorithms, but I think it is impossible), but without memory allocation and without increasing the complexity of alg.

+9
c ++ stl
Jan 17 '14 at 20:25
source share
5 answers

This is done in O (N + M) if both arrays are sorted.

  auto ib = std::begin(two); auto iter = std::remove_if ( std::begin(one), std::end(one), [&ib](int x) -> bool { while (ib != std::end(two) && *ib < x) ++ib; return (ib != std::end(two) && *ib == x); }); 
+8
Jan 17 '14 at 21:09
source share

Sort b so you can binary search for it to reduce time complexity. Then use erase-delete idiom to discard all elements from a that are contained in b :

 sort( begin(b), end(b) ); a.erase( remove_if( begin(a),end(a), [&](auto x){return binary_search(begin(b),end(b),x);}), end(a) ); 

Of course, you can sacrifice time complexity for simplicity and reduce your code by removing sort() and replacing binary_search() with find() :

 a.erase( remove_if( begin(a),end(a), [&](auto x){return find(begin(b),end(b),x)!=end(b);}), end(a) ); 

This is a matter of taste. In both cases, you do not need heap allocations. By the way, I use automatic lambda parameters, which are C ++ 14. Some compilers already implement this function, such as clang. If you do not have such a compiler, but only C ++ 11, replace auto with the type of the container element.

By the way, this code does not mention any types! You can write a template function to make it work for all types. The first option requires random access iteration b , while the second part of the code does not.

+4
Jan 18 '14 at 0:13
source share

One solution that comes to mind combines remove_if and binary_search . This is as effective as your manual solution, but it can be a little more "elegant" because it uses more features of STL.

 sort(begin(b), end(b)); auto iter = remove_if(begin(a), end(a), [](auto x) { return binary_search(begin(b), end(b), x); }); // Now [begin(a), iter) defines a new range, and you can erase them however // you see fit, based on the type of a. 
+3
Jan 17 '14 at 20:33
source share

The current code is clear enough, as it should be obvious to any programmer what is happening.

Current performance is O ( a.size() * b.size() ), which can be pretty bad depending on actual sizes.

A more concise and STL-like way of describing this is to use remove_if with a predicate that tells you if in is in.

 b.erase(std::remove_if(b.begin(), b.end(), [](const auto&x) { return std::find(a.begin(), a.end(), x) != a.end(); }), b.end()); 

(Not tested, so I could make a syntax error.) I used lambda, but you can create a functor if you are not using the C ++ 11 compiler.

Note that the source code only removes one instance of the value in b , which is also in a . My solution will remove all instances of that value from b .

Note that the find operation is performed over and over, so it is probably best to do this on a smaller vector for better locality of the link.

+1
Jan 17 '14 at 21:00
source share

after thinking for a while, I thought about it (note: replying to my own Q im, not stating that it exceeds the proposed A):

 vector<int64_t> a{3,2,7,5,11,13}, b{2,3,13,5}; set<int64_t> bs(b.begin(), b.end()); for (const auto& num: bs) cout << num << " "; cout << endl; for (const auto& num: a) bs.erase(num); vector<int64_t> result(bs.begin(), bs.end()); for (const auto& num: result) cout << num << " "; 
0
Jan 20 '14 at
source share



All Articles