Improving looping through an array twice (nested loop over the same array)

I have a large dataset that I want to iterate through to compute various statistics, comparing each time point "D1" against a later time point "D2". Basically, I want to insert a row into the database every time the difference between the two values is greater than 10. For example:

    Datum[] data = x;
    for (Datum d1 : data) {
        Datum[] tail = y; // From d1, up to 10 elements ahead
        for (Datum d2 : tail) {
            // Calculate difference
            if ((d2.val - d1.val) > 10) {
                // Insert into database
            }
        }
    }

My question is: is there a better algorithm/method for this? Since 9 elements of the tail are reused in the next iteration of the outer loop, can I somehow benefit from that? My goal was to get well below O(n^2) (Big-O notation), but I can't wrap my head around it. Typically, finding a pair D1, D2 that satisfies the criterion means the next element D1 has a greater chance of matching as well. Can I use this to my advantage?

I am trying to make this as efficient as possible because the data set is so large.

4 answers

An index-based loop can perform much better than an iterator, since you can index directly into the original array rather than copying it into a new array. You get much better memory locality, less chance of false sharing, etc.
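A minimal sketch of what that looks like, assuming Datum exposes a numeric val field (the Datum definition here is just a stand-in for the question's type):

    class IndexedScan {
        // Stand-in for the question's Datum; only `val` matters here.
        record Datum(long time, int val) {}

        static void scan(Datum[] data) {
            for (int i = 0; i < data.length; i++) {
                int end = Math.min(i + 11, data.length); // d1 plus up to 10 elements ahead
                for (int j = i + 1; j < end; j++) {      // no tail copy: index `data` directly
                    if (data[j].val() - data[i].val() > 10) {
                        // Insert into database
                    }
                }
            }
        }
    }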


You have the classic sweep-line algorithm, which is O(kn), where k is the "overlap", i.e. the part the inner loop passes over. In your case k is at most 10, no matter what n is:

    Datum[] data = x;
    for (int i = 0; i < data.length; i++) {
        Datum d1 = data[i];
        // From d1, up to 10 elements ahead
        for (int j = i + 1; j < Math.min(i + 11, data.length); j++) {
            Datum d2 = data[j];
            // Calculate difference
            if ((d2.val - d1.val) > 10) {
                // Insert into database
                break; // inner loop
            }
        }
    }

In your shoes, the first thing I would do is profile a typical dataset and find out where the time is going. That should give some clues about where to focus your optimization efforts.

Assuming the computation is as simple as the subtraction/comparison, and the arrays are quickly accessible, then optimizing the database writes, as you suspect, should be the next priority. For example, writing out a text file and using a bulk insert can give very good performance compared to individual inserts. If you stick with individual inserts and use JDBC, then batch updates will help a great deal, since they avoid the latency of a round trip to the database per row.
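A minimal sketch of the JDBC batching idea; the table and column names (pairs, t1, t2, diff) and the chunk size of 1000 are illustrative assumptions, not anything from the question:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    class BatchInsert {
        static void insertPairs(Connection conn, List<long[]> rows) throws SQLException {
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO pairs (t1, t2, diff) VALUES (?, ?, ?)")) {
                int pending = 0;
                for (long[] r : rows) {
                    ps.setLong(1, r[0]);
                    ps.setLong(2, r[1]);
                    ps.setLong(3, r[2]);
                    ps.addBatch();
                    if (++pending == 1000) { // flush in chunks to bound memory
                        ps.executeBatch();
                        pending = 0;
                    }
                }
                if (pending > 0) ps.executeBatch();
            }
        }
    }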

If that still isn't fast enough, consider splitting the array into N sections and processing each section on a separate thread. This will be especially effective if the processing is CPU-bound.
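One way to sketch that with parallel streams (an assumption on my part; the answer doesn't prescribe a threading mechanism): each outer index is independent, so the outer loop parallelizes cleanly, and the database writes should be buffered per thread rather than done inside the loop.

    import java.util.stream.IntStream;

    class ParallelScan {
        record Datum(long time, int val) {}

        static void scan(Datum[] data) {
            IntStream.range(0, data.length).parallel().forEach(i -> {
                int end = Math.min(i + 11, data.length);
                for (int j = i + 1; j < end; j++) {
                    if (data[j].val() - data[i].val() > 10) {
                        // record (i, j) in a thread-local buffer for a later batch insert
                    }
                }
            });
        }
    }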

Finally, look for code-level optimizations, such as avoiding iterators by using an index. If the number of elements written to the database is small compared to the number of elements iterated over, then iterator creation could be a bottleneck.

If the lookahead is more than 10 elements, and critically, more than can fit in the processor cache, it is more efficient to break the scan into smaller blocks. For example, instead of scanning 1000 elements ahead in data, break it into (say) 10 scans of 100, with each of the 10 scans using a different value of d1. This is similar to how matrix multiplication is implemented in a blocked fashion, and makes better use of processor caches.
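A sketch of that blocking idea under some assumptions of mine: the window length k is a parameter (the answer's point only matters when k is large), the block size B is illustrative and would be tuned to the cache, and Datum is a stand-in:

    class BlockedScan {
        record Datum(long time, int val) {}

        static void scan(Datum[] data, int k) {
            final int B = 128; // illustrative block size
            int n = data.length;
            for (int i0 = 0; i0 < n; i0 += B) {
                int i1 = Math.min(i0 + B, n);   // d1 block: indices [i0, i1)
                int jMax = Math.min(i1 + k, n); // furthest d2 any d1 in this block needs
                for (int j0 = i0 + 1; j0 < jMax; j0 += B) {
                    int j1 = Math.min(j0 + B, jMax); // d2 block: indices [j0, j1)
                    for (int i = i0; i < i1; i++) {
                        int lo = Math.max(j0, i + 1);           // d2 must come after d1
                        int hi = Math.min(j1, Math.min(i + k + 1, n)); // and within k ahead
                        for (int j = lo; j < hi; j++) {
                            if (data[j].val() - data[i].val() > 10) {
                                // Insert into database
                            }
                        }
                    }
                }
            }
        }
    }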

Although you use two loops, which usually signals O(n^2), the second loop has a fixed size of 10 elements, so this reduces to a simple constant factor: you do roughly 10 times the work of a single pass.


There is an asymptotically faster way to solve this problem, but I have serious doubts about whether it will run faster in practice, because your window size (10) is so small. If you want to increase that size, which I will call k, then you may want to consider an approach like the one below.

The algorithm works by maintaining a window of k elements that supports two operations:

  • Insert a new element, evicting the oldest.
  • Return all elements greater than some value.

One way to do this is to store all of your elements in a data structure that combines a balanced binary search tree and a queue. The queue holds all k elements in the order in which they appear in the original sequence, and is used so that we can remember which element to evict when a new element must be added. The balanced BST stores a copy of each of these elements in sorted order. This means you can implement the two operations as follows (a concrete sketch in Java follows the list):

  • Insert a new element, evicting the oldest: add the new element to the queue and the BST. Then dequeue the oldest element and remove it from the BST. Runtime: O(log k), since the BST holds k elements.
  • Return all elements greater than some value: using the BST, find the smallest element at least as large as that value in O(log k) time. Then scan through the BST and list all elements at least that large. This takes O(z) time, where z is the total number of matches found.
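A minimal sketch of this window structure, using java.util.TreeMap as the balanced BST (with counts, so duplicate values work) and an ArrayDeque as the queue; the int values and the class name are my assumptions. You would drive it by keeping the window filled with the k elements ahead of the current d1 and querying valuesAbove(d1.val + 10):

    import java.util.ArrayDeque;
    import java.util.NavigableMap;
    import java.util.TreeMap;

    class ValueWindow {
        private final ArrayDeque<Integer> queue = new ArrayDeque<>();     // arrival order
        private final TreeMap<Integer, Integer> sorted = new TreeMap<>(); // value -> count (multiset)
        private final int k;

        ValueWindow(int k) { this.k = k; }

        // Operation 1: insert a new value, evicting the oldest once the window holds k.
        // O(log k) for the TreeMap updates.
        void add(int val) {
            queue.addLast(val);
            sorted.merge(val, 1, Integer::sum);
            if (queue.size() > k) {
                int oldest = queue.removeFirst();
                if (sorted.merge(oldest, -1, Integer::sum) == 0) {
                    sorted.remove(oldest);
                }
            }
        }

        // Operation 2: every distinct value strictly greater than `threshold`, with
        // its multiplicity. Finding the successor costs O(log k); iterating the view
        // costs O(z) in the number of matches.
        NavigableMap<Integer, Integer> valuesAbove(int threshold) {
            return sorted.tailMap(threshold, false);
        }
    }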

In total, if you have n elements and z pairs that need to be inserted into the database, this algorithm takes O(n log k + z) time. To see this, note that we perform n instances of operation (1), each of which takes O(log k) time. We also perform n instances of operation (2), which take O(n log k) time in total to find the successors, plus O(z) total time across all iterations to list the matching pairs.

The asymptotic runtime of this algorithm compares well with the O(nk) algorithm you originally posted. Assuming the number of matches isn't "really huge" (say, on the order of nk), this will be much faster as n and k increase.

If the values you are storing are integers in a small range (say, 0 to 10,000), you can speed this up even further by replacing the balanced BST with a data structure optimized for integers, such as a van Emde Boas tree, which reduces the runtime to O(n log log k + z). Again, this is asymptotically faster, and if you keep the constant k at 10, it is almost certainly not worth it.

Hope this helps!


Source: https://habr.com/ru/post/896171/

