C++: select a subset of std::vector based on predefined element indices

I am looking for an efficient way to crop or copy a subset of an existing std::vector. An element belongs to the subset (or to the remainder) if its index is contained in a separate, predefined std::vector.

For example:

    std::vector<std::string> Test = { "A", "B", "C", "D", "E" };
    std::vector<int> SelectionV = { 1, 2, 5 };
    // Result = { "A", "B", "E" }

I will do this on a very large vector, and probably on a regular basis, so I am looking for as efficient a method as possible.

An alternative that I am also considering, but again am unsure how to do efficiently:

The Test object is filled (in my case from a third-party object) in a single pass with an iterator; direct element access is not possible. I was wondering whether I could instead add to the Test vector only those elements whose position in the iteration appears in SelectionV.

For example:

    int count = 0;
    for (Iterator.begin(); Iterator != Iterator.end(); Iterator++, count++) {
        if (count is a number contained in SelectionV)
            add to Test
    }

But I assume this would mean searching through SelectionV on every iteration, which would be much less efficient than simply adding all the elements and then selecting the subset.

Any help is greatly appreciated.

+4
4 answers

You can also use the standard library:

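    // Default-construct the strings; a fill value of 0 would try to build each std::string from a null pointer.
    std::vector<std::string> Result(SelectionV.size());

    // Capture Test by reference so the lambda does not copy the whole vector.
    std::transform(SelectionV.begin(), SelectionV.end(), Result.begin(), [&Test](size_t pos) { return Test[pos]; });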
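
A self-contained version of the same idea might look like the sketch below. The helper name `select_by_index` is hypothetical, and the sketch assumes zero-based indices that are all within range (the question's example appears to count from 1, so those values would need shifting by one).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): copies the elements of
// `source` found at the given zero-based positions, in the order listed.
std::vector<std::string> select_by_index(const std::vector<std::string>& source,
                                         const std::vector<std::size_t>& positions) {
    std::vector<std::string> result(positions.size());  // default-constructed strings
    std::transform(positions.begin(), positions.end(), result.begin(),
                   [&source](std::size_t pos) {
                       assert(pos < source.size());  // assumption: every index is valid
                       return source[pos];           // copies one element
                   });
    return result;
}
```

The cost is one element copy per entry in `positions`, independent of the size of `source`, and repeated indices simply produce repeated elements in the result.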

+3

You can sort your SelectionV vector in ascending order and then rewrite your for loop along these lines:

    int index = 0, nextInSelectionV = 0;
    for (Iterator.begin(); nextInSelectionV < SelectionV.size() && Iterator != Iterator.end(); Iterator++) {
        if (index == SelectionV[nextInSelectionV]) {
            add to Test
            nextInSelectionV++;
        }
        index++;
    }
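
As a rough, compilable sketch of that loop: the function below walks any iterator range once and keeps only the elements whose zero-based position is in the selection. The name `select_in_one_pass` is hypothetical, and a pair of standard iterators stands in for the third-party iterator, whose interface is not shown in the question. The selection is sorted and deduplicated on a local copy, which is an assumption added here so that repeated indices do not stall the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: one forward pass over [first, last), keeping only the
// elements whose zero-based position appears in `selection`.
template <typename InputIt>
std::vector<typename std::iterator_traits<InputIt>::value_type>
select_in_one_pass(InputIt first, InputIt last, std::vector<std::size_t> selection) {
    // Sort and deduplicate the local copy so indices can be consumed in order.
    std::sort(selection.begin(), selection.end());
    selection.erase(std::unique(selection.begin(), selection.end()), selection.end());

    std::vector<typename std::iterator_traits<InputIt>::value_type> result;
    result.reserve(selection.size());

    std::size_t index = 0;  // position of *first within the sequence
    std::size_t next = 0;   // next entry of `selection` still to be matched
    for (; first != last && next < selection.size(); ++first, ++index) {
        if (index == selection[next]) {
            result.push_back(*first);
            ++next;
        }
    }
    return result;
}
```

Each step does a single comparison, so the pass costs time proportional to the sequence length plus the one-off sort of the selection. The loop also stops early once the last selected index has been found, and indices beyond the end of the sequence are silently ignored.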
+1
  • It depends on how big Test is, how big SelectionV is (as a percentage of Test), and whether the items in SelectionV are repeated. You can potentially optimize by computing the complement of SelectionV instead.
  • Note that in your example, since SelectionV holds indices rather than values, each lookup in Test is already O(1) (which is a huge plus).
  • If Test and SelectionV do not change, and if they are large, you can also split SelectionV across n threads, have each thread independently look up its values in Test, and then combine the individual outputs later (similar to map-reduce). The disadvantage may be a loss of CPU cache hits.
  • For repeated calls, you may want to compute the difference between the old SelectionV and the new SelectionV and work with that instead. This kind of caching optimization works well when only a small number of entries change between iterations.
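
One way to handle both repeated indices and the complement idea from the first bullet is to turn the indices into a boolean mask first. This is only a sketch under stated assumptions: the name `filter_by_mask` and its `keep_selected` parameter are hypothetical, indices are zero-based and in range, and the result comes back in the order of the source vector rather than the order of the selection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: marks the selected positions in a mask, then filters
// `source` in a single pass. With keep_selected == false it returns the
// complement, i.e. every element whose index is NOT in `selection`.
template <typename T>
std::vector<T> filter_by_mask(const std::vector<T>& source,
                              const std::vector<std::size_t>& selection,
                              bool keep_selected = true) {
    std::vector<bool> selected(source.size(), false);
    for (std::size_t pos : selection) {
        assert(pos < source.size());  // assumption: every index is in range
        selected[pos] = true;         // repeated indices collapse to one mark
    }

    std::vector<T> result;
    for (std::size_t i = 0; i < source.size(); ++i) {
        if (selected[i] == keep_selected) {
            result.push_back(source[i]);
        }
    }
    return result;
}
```

The mask costs one bit per source element and a full pass over the source, so it is mainly attractive when the selection covers a large fraction of the vector or when the remainder is what is actually wanted.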

Most importantly, make sure you really need to optimize this before spending time on it (and, even worse, complicating your code).

There is a very high chance that other parts of your application (such as I/O) are several times slower.

+1

Perhaps the following will be useful to someone in the future:

    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Select a single element by index.
    template<typename T>
    T vector_select(const std::vector<T>& vector, const std::size_t index)
    {
        assert(index < vector.size());
        return vector[index];
    }

    // Function object that looks up indices in a fixed vector.
    template<typename T>
    class VectorSelector
    {
    public:
        VectorSelector(const std::vector<T>& v) : _v(&v) { }
        T operator()(const std::size_t index) const { return vector_select(*_v, index); }
    private:
        const std::vector<T>* _v;
    };

    // Select several elements; the result follows the order of `index`.
    template<typename T>
    std::vector<T> vector_select(const std::vector<T>& vector, const std::vector<std::size_t>& index)
    {
        // Guard the empty case: dereferencing max_element on an empty range is undefined.
        assert(index.empty() || *std::max_element(index.begin(), index.end()) < vector.size());
        std::vector<T> out(index.size());
        std::transform(index.begin(), index.end(), out.begin(), VectorSelector<T>(vector));
        return out;
    }
0

Source: https://habr.com/ru/post/1400775/

