C++ accumulator library with the ability to remove old samples

In Boost.Accumulators, you can add samples to an accumulator and then extract statistical quantities from it, e.g.:

    acc(1.);
    acc(2.);
    acc(3.);
    std::cout << mean(acc); // 2

There are many more complex statistics in the library, such as skewness, kurtosis or p_square_cumulative_distribution.

What I would like to do is something like this:

    acc(1.);
    acc(2.);
    acc(3.);
    std::cout << mean(acc); // 2
    acc.pop();              // remove the first value (1.)
    std::cout << mean(acc); // 2.5

pop() would work in FIFO (first in, first out) order. What I'm trying to do is compute statistics for my data online (incrementally) over a sliding time window.

The accumulator would have to store all values internally.

I could write my own, but I always like to check existing libraries first, and there may be an algorithm I don't know about that cleverly updates the quantities as data arrives or leaves.

+4
3 answers

Since you mentioned a "sliding time window", one option is to use the rolling mean (there are also a rolling sum and a rolling count), which is the mean over the last N samples. Depending on your needs, you can create separate accumulators with different window sizes.

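    #include <iostream>
    #include <boost/accumulators/accumulators.hpp>
    #include <boost/accumulators/statistics/stats.hpp>
    #include <boost/accumulators/statistics/rolling_mean.hpp>
    using namespace boost::accumulators;

    typedef accumulator_set<double, stats<tag::rolling_mean> > my_accumulator;
    my_accumulator acc(tag::rolling_window::window_size = 3);
    acc(1.); acc(2.); acc(3.);
    std::cout << rolling_mean(acc); // 2
    // Reset the accumulator and use a different window size
    acc = my_accumulator(tag::rolling_window::window_size = 2);
    acc(2.); acc(3.);
    std::cout << rolling_mean(acc); // 2.5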

Also, if you look at the implementation, it uses boost/circular_buffer.hpp.

+9

You probably just need to store all your samples in a vector and then feed them from the vector into an accumulator for each calculation. Something like this: fooobar.com/questions/23762 / ...

+1

You probably want to store the data in a std::deque instead of a vector, so that both insertion and removal have constant complexity. With a vector, one of the two will inevitably be linear.
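As a rough illustration of the deque idea, here is a minimal sketch of a FIFO window with a running sum, so that the mean stays available in constant time. The class SlidingMean and its member names are hypothetical and not part of Boost or the standard library.

```cpp
#include <cstddef>
#include <deque>
#include <stdexcept>

// Hypothetical helper, not part of Boost: keeps samples in FIFO order and
// maintains a running sum so the mean can be read in constant time.
class SlidingMean {
public:
    // Add a new sample at the back of the window.
    void push(double x) {
        samples_.push_back(x);
        sum_ += x;
    }

    // Remove the oldest sample (FIFO); throws if the window is empty.
    void pop() {
        if (samples_.empty()) {
            throw std::out_of_range("pop on empty window");
        }
        sum_ -= samples_.front();
        samples_.pop_front();
    }

    std::size_t size() const { return samples_.size(); }

    // Mean of the samples currently in the window; throws if empty.
    double mean() const {
        if (samples_.empty()) {
            throw std::out_of_range("mean of empty window");
        }
        return sum_ / static_cast<double>(samples_.size());
    }

private:
    std::deque<double> samples_;
    double sum_ = 0.0;
};
```

Pushing 1., 2. and 3. gives a mean of 2; after one pop() the mean is 2.5, which matches the behaviour asked for in the question. One caveat: repeatedly adding to and subtracting from a running sum can accumulate rounding error over a long stream, so recomputing the sum from the deque now and then may be worthwhile.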

Beyond that, this is a pretty simple matter of applying an algorithm to a collection. Oddly, though, I don't know of a set of such algorithms that has already been written and tested, even though it seems like a fairly obvious thing to provide.

For what it's worth, it's pretty simple to write an adapter that feeds data from the collection into an accumulator to compute whichever statistic(s) you care about. In some cases the accumulator will probably do a little extra work because it computes its results incrementally, but I'd guess it's pretty rare to lose enough efficiency to care.

+1

Source: https://habr.com/ru/post/1436264/

