How to use C ++ STL and boost to determine if two sorted vectors intersect

I have two sorted C ++ std :: vector without duplicates (you can call them sets) and I want to know if they intersect. I do not need a vector of common elements.

I wrote the code at the end of this question using the boost :: set_intersection algorithm in the extended range library (http://www.boost.org/doc/libs/1_50_0/libs/range/doc/HTML/range/link/algorithms/set .html). This code avoids building a set of common elements, but scans all the elements of vectors.

Is it possible to improve my function "crosses" with boost and C ++ STL without using a loop? I would like to dwell on the first common element in vectors, or at least avoid my counterclass.

The extended range library provides "includes" and "set_intersection", but does not "cross". It makes me think that it “crosses” trivially or is provided elsewhere, but I cannot find it.

thanks!

#include <vector> #include <string> #include <boost/assign/list_of.hpp> #include <boost/function_output_iterator.hpp> #include <boost/range/algorithm.hpp> #include <boost/range/algorithm_ext/erase.hpp> template<typename T> class counter { size_t * _n; public: counter(size_t * b) : _n(b) {} void operator()(const T & x) const { ++*_n; } }; bool intersects(const std::vector<std::string> & a, const std::vector<std::string> & b) { size_t found = 0; boost::set_intersection(a, b, boost::make_function_output_iterator(counter<std::string>(&found))); return found; } int main(int argc, char ** argv) { namespace ba = boost::assign; using namespace std; vector<string> a = ba::list_of(string("b"))(string("vv"))(string("h")); vector<string> b = ba::list_of(string("z"))(string("h"))(string("aa")); boost::erase(a, boost::unique<boost::return_found_end>(boost::sort(a))); boost::erase(b, boost::unique<boost::return_found_end>(boost::sort(b))); cout << "does " << (intersects(a, b) ? "" : "not ") << "intersect\n"; return 0; } 
+4
source share
1 answer

First, to respond to a comment, boost set_intersection takes ranges as parameters compared to the STL that accepts iterators.

Other than that, there is no real difference in terms of algorithm and complexity.

As far as I know, there is no ready-made library to do what you want to do, which is just a check if the two sequences are unique and immediately terminate if they are not.

You should also understand that you will always have the “worst case scenario” when they are truly unique.

The complexity is O (N + M), although you can also use binary search in one of the collections, which will make it O (N log M) or O (M log N), and if one of them is much larger than the other it can be big savings. (e.g., N = 1,000,000, M = 20, M log N is only about 400).

You can “reduce” by taking the median of one, find it in the other and compare the subranges in the individual threads.

There is also a “terrible” decision that your functor receives a call at the intersection, thereby choosing you from the loop. (Yes, there is one, even if it is hidden in the algorithm). Perhaps we can write our own, although this O (N + M) is very simple. I will do it with 4 iterators:

  template< typename Iter1, typename Iter2 > bool intersects( Iter1 iter1, Iter1 iter1End, Iter2 iter2, Iter2 iter2End ) { while( iter1 != iter1End && iter2 != iter2End ) { if( *iter1 < *iter2 ) { ++iter1; } else if ( *iter2 < *iter1 ) { ++iter2; } else return true; } return false; } // Predicate version where the compare version returns <0 >0 or 0 template< typename Iter1, typename Iter2, typename Comp > bool intersects( Iter1 iter1, Iter1 iter1End, Iter2 iter2, Iter2 iter2End, Comp comp ) { while( iter1 != iter1End && iter2 != iter2End ) { int res = comp( *iter1, *iter2 ); if( res < 0 ) { ++iter1; } else if ( res > 0 ) { ++iter2; } else return true; } return false; } 
+3
source

Source: https://habr.com/ru/post/1440091/


All Articles