I have written several K-nearest neighbor query methods that build a list of points closest to a given query point. To maintain this list of neighbors, I use std::priority_queue
, so the top element is the farthest neighbor to the request point. This way, I know if I should click on a new element that is currently being checked (if it is at a shorter distance than the current farthest neighbor) and can pop () the farthest element when my priority queue has more than K elements.
So far so good. However, when I output the elements, I would like to order them from the nearest to the farthest. Currently, I just pull all the items out of the priority queue and put them in the output container (via the iterator), which leads to a sequence of points ordered from the farthest to the nearest, so I call std::reverse
on the output range of the iterator.
As a simple example, a linear search is given, which uses the priority queue (it is obvious that the closest neighbor query methods used are much more complicated):
template <typename DistanceValue, typename ForwardIterator, typename OutputIterator, typename GetDistanceFunction, typename CompareFunction> inline OutputIterator min_dist_linear_search(ForwardIterator first, ForwardIterator last, OutputIterator output_first, GetDistanceFunction distance, CompareFunction compare, std::size_t max_neighbors = 1, DistanceValue radius = std::numeric_limits<DistanceValue>::infinity()) { if(first == last) return output_first; typedef std::priority_queue< std::pair<DistanceValue, ForwardIterator>, std::vector< std::pair<DistanceValue, ForwardIterator> >, detail::compare_pair_first<DistanceValue, ForwardIterator, CompareFunction> > PriorityQueue; PriorityQueue output_queue = PriorityQueue(detail::compare_pair_first<DistanceValue, ForwardIterator, CompareFunction>(compare)); for(; first != last; ++first) { DistanceValue d = distance(*first); if(!compare(d, radius)) continue; output_queue.push(std::pair<DistanceValue, ForwardIterator>(d, first)); while(output_queue.size() > max_neighbors) output_queue.pop(); if(output_queue.size() == max_neighbors) radius = output_queue.top().first; }; OutputIterator it = output_first; while( !output_queue.empty() ) { *it = *(output_queue.top().second); output_queue.pop(); ++it; }; std::reverse(output_first, it); return it; };
The above is all dandies, with one exception: the output-iterator type is required to be bidirectional and essentially point to a pre-allocated container. Now this practice of storing output in the range prescribed by some output iterator is large and fairly standard (for example, std::copy
and other STL algorithms are good examples). However, in this case, I would like to be able to require only a direct type of output-iterator, which allows the use of reverse input iterators, such as those provided for STL and iostreams containers.
Thus, it comes down to canceling the priority queue before , flushing its contents in the output iterator. So these are the best options I could come up with:
Create std::vector
, unload the contents of the priority queue into it, and unload the elements into the output iterator using the reverse iterator on the vector.
Replace std::priority_queue
sorted container (e.g. std::multimap
), and then upload the contents to the iterator output using the appropriate traversal order.
Is there any other reasonable option?
I used to use std::multimap
in the previous implementation of this algorithm and others, as in my second version above. However, when I switched to std::priority_queue
, the performance gain was significant. Thus, I would prefer not to use the second option, since it really seems that using a priority queue to maintain a list of neighbors is much better than relying on a sorted array. By the way, I also tried std::vector
, which I supported, sorted using std::inplace_merge
, which was better than a multimap, but did not match the priority queue.
As for the first option, which is my best option at the moment, it just seems to me that I need to do this double data transfer (queue → vector → output). I am just inclined to think that there should be an easier way to do this ... something that I am missing.
The first option is really not so bad in this application (given the complexity of the algorithm preceding it), but if there is a trick to avoid this double memory transfer, I would like to know about it.