Avoiding under-/overflow in a "simple summation" of doubles

I am struggling with a summation problem that fails with underflow or overflow.

I have more than 8271571 double values, of which I need the arithmetic mean.

But the main problem is that I'm not smart enough to do this.

I am currently just summing them up and dividing by the size. In most cases this fails with underflow or overflow, giving me -1.#INF or 1.#INF.

    for (size_t j = 0; j < 12; j++) {
        double a = 0.0;
        for (size_t i = 0; i < Features->size(); i++) {
            a += Features->at(i)->at(j);
        }
        meanVector[j] = a / Features->size();
    }

However, there is no way to say that the values are only positive or only negative, so I cannot switch to an unsigned data type.

I also tried factoring out a constant during the summation, or dividing by the size while adding the values, but that does not help either.

From a quick look, the values range from about -20 to +30, but I can't say for sure.

So maybe someone can give me a hint on how to do the math or use a workaround. It should be possible, but I just don't have enough ideas.

Edit:

The size is never 0; a check is performed before the division. In addition, none of the values is invalid in any way: when extracting them, I already check for #IND and NaN.

If I divide during the summation instead, I think the result is not correct either?

    a += Features->at(i)->at(j) / Features->size();

results in -3.7964983860343639e+305, and that in every iteration. That can't be right, and it looks like a boundary value.

Edit 2:

So, some of you guys were absolutely right. There is a lot of garbage in the data:

    0: size: 8327571, min: -2.24712e+307, max: 3362.12
    1: size: 8327571, min: -2.24712e+307, max: 142181
    2: size: 8327571, min: -2.24712e+307, max: 59537.8
    3: size: 8327571, min: -2.24712e+307, max: 236815
    4: size: 8327571, min: -2.24712e+307, max: 353488
    5: size: 8327571, min: -2.24712e+307, max: 139960
    6: size: 8327571, min: 0, max: 0
    7: size: 8327571, min: 0, max: 0
    8: size: 8327571, min: 0, max: 0
    9: size: 8327571, min: 0, max: 0
    10: size: 8327571, min: 0, max: 0
    11: size: 8327571, min: 0, max: 0

2 answers
  • I have over 8271571 double values, of which I need the arithmetic mean.
  • From a quick look, the values range from about -20 to +30, but I can't say for sure.
  • The size is never 0; a check is performed before the division.

This does not add up. The sum should easily fit into a double. There must be something wrong with the data. You can quickly check your values as follows:

    // Needs <algorithm>, <cmath>, <iostream> and <vector>
    for (size_t j = 0; j < 12; ++j) {
        std::vector<double> values;
        values.reserve(Features->size());
        for (size_t i = 0; i < Features->size(); ++i) {
            values.push_back(Features->at(i)->at(j));
        }

        // Find extreme values, including infinity
        std::cout << j << ": "
                  << "size: " << values.size()
                  << ", min: " << *std::min_element(values.begin(), values.end())
                  << ", max: " << *std::max_element(values.begin(), values.end())
                  << std::endl;

        // Find NaNs. Use exactly one of the following checks:
        //   C++11 and later:     std::isnan(x)
        //   Visual Studio:       _isnan(x)
        //   GCC prior to C++11:  __builtin_isnan(x)
        for (size_t i = 0; i < values.size(); ++i) {
            if (std::isnan(values[i])) {
                std::cout << "NaN at [" << i << ", " << j << "]" << std::endl;
            }
        }
    }

You should be able to quickly determine if there is anything strange with the input.
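As a rough plausibility check of this argument: with 8327571 values of magnitude at most 30, the sum is bounded by about 2.5e8, nowhere near the largest double (about 1.8e308). A value such as the -2.24712e+307 reported in the question's second edit, on the other hand, is roughly that maximum divided by 8, so a handful of them is enough to overflow. The following is only a minimal sketch, not the asker's code; `naiveMean` and `countSuspicious` are hypothetical helper names, and the range passed to `countSuspicious` is an assumption based on the "-20 to +30" estimate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive mean: sum everything, then divide (the approach from the question).
double naiveMean(const std::vector<double>& values) {
    assert(!values.empty());
    double sum = 0.0;
    for (double v : values) {
        sum += v;  // overflows to +/-infinity once |sum| exceeds the double range
    }
    return sum / static_cast<double>(values.size());
}

// Counts NaNs, infinities, and entries outside the assumed range [lo, hi].
std::size_t countSuspicious(const std::vector<double>& values,
                            double lo, double hi) {
    std::size_t count = 0;
    for (double v : values) {
        if (!std::isfinite(v) || v < lo || v > hi) {
            ++count;
        }
    }
    return count;
}
```

Calling `naiveMean` on nine copies of -2.24712e307 yields negative infinity, while `countSuspicious(values, -20.0, 30.0)` reports how many entries would need to be investigated or filtered out first.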


You can calculate the mean with an online algorithm, which means that you do not have to add up all the values before dividing. Here:

    template <typename NumberType>
    class ProgressiveMean {
        NumberType m_Mean;
        NumberType m_MeanKMinus1;
        long m_K;

    public:
        ProgressiveMean();
        void Seed(NumberType seed);
        void AddValue(NumberType newVal);
        NumberType getMean() const;
    };

    template <typename NumberType>
    ProgressiveMean<NumberType>::ProgressiveMean()
        : m_Mean(0), m_MeanKMinus1(0), m_K(0) {}

    template <typename NumberType>
    void ProgressiveMean<NumberType>::Seed(NumberType seed) {
        m_Mean = seed;  // so getMean() is correct even before AddValue is called
        m_MeanKMinus1 = seed;
        m_K = 2;  // Start from K = 1, so the next one is 2
    }

    template <typename NumberType>
    void ProgressiveMean<NumberType>::AddValue(NumberType newVal) {
        // mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k
        m_Mean = m_MeanKMinus1 + (newVal - m_MeanKMinus1) / m_K;
        m_MeanKMinus1 = m_Mean;
        m_K++;
    }

    template <typename NumberType>
    NumberType ProgressiveMean<NumberType>::getMean() const {
        return m_Mean;
    }

To use this, call Seed with the first value, call AddValue for each of the remaining values, and call getMean when you are done.
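The update rule behind AddValue can also be written as a single free function, which may be easier to drop into an existing loop. This is a minimal sketch under the assumption that the values are already available in a `std::vector<double>`; `runningMean` is a hypothetical name and not part of the class above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Incremental mean: mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k.
// No large running sum is ever formed.
double runningMean(const std::vector<double>& values) {
    assert(!values.empty());
    double mean = 0.0;
    std::size_t k = 0;
    for (double x : values) {
        ++k;
        mean += (x - mean) / static_cast<double>(k);
    }
    return mean;
}
```

Because the running value stays near the magnitude of the data, this avoids the overflow that comes from accumulating a huge sum. It does not repair bad input: uninitialized or sentinel values still distort the result, and the difference `x - mean` can itself overflow if two huge values of opposite sign meet.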

This idea is from Knuth, and I got it from here.

You can also use a big-number (arbitrary-precision) library.


Source: https://habr.com/ru/post/1441832/

