Average cell size of a jittered grid

I am trying to calculate the average cell size for the following set of points, shown in the figure (grid). The image was generated using gnuplot:

gnuplot> plot "debug.dat" using 1:2 

The points are almost aligned on a rectangular grid, but not quite. There seems to be an offset (jitter?) of, let's say, 10-15% along X or Y. How would you efficiently calculate a proper division into tiles, so that there is practically only one point per tile? The size would be expressed as (tilex, tiley). I say "practically" because a 10-15% offset may have moved a point into an adjacent tile.

Just for reference, I manually sorted (hopefully correctly) and extracted the first 10 points:

  -133920,33480 -132480,33476 -131044,33472 -129602,33467 -128162,33463 -139679,34576 -138239,34572 -136799,34568 -135359,34564 -133925,34562 

Just to clarify, the actual tile size for the description above would be (1435,1060), but I'm really looking for a quick automatic way.

1 answer

Do this for the X coordinate alone:

1) sort the X coordinates

2) look at the deltas between consecutive X coordinates. These deltas fall into two categories: they correspond either to the gaps between two columns, or to the gaps between crosses within the same column. Your goal is to find a threshold that separates the long gaps from the short ones. This can be done by finding the threshold that splits the deltas into two groups whose means are farthest from each other (I think)

3) once you have the threshold, separate the points into columns. A column begins and ends at a delta that exceeds the threshold found in the previous step

4) calculate the average position of each detected column

5) take the deltas between consecutive columns. The problem now is that a stray point may break up your columns. Use the median to get rid of such outliers.

6) You should now have a reliable estimate of your gridX

An example using your data looking at the X axis:

 -133920 -132480 -131044 -129602 -128162 -139679 -138239 -136799 -135359 -133925 

Sort + Delta:

 5 1434 1436 1440 1440 1440 1440 1440 1442 

Here you can see a very obvious threshold between the small (5) and large (1434 and higher) deltas. A delta of 1434 or more marks a gap between columns here.
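A minimal Python sketch of how that threshold could be found automatically, using the "two groups whose means are farthest apart" idea from step 2. The function name `split_threshold` is hypothetical, and the sketch assumes both short and long deltas are actually present; if every column holds a single point there are no short deltas, and the split it returns is meaningless.

```python
def split_threshold(deltas):
    """Return a value separating the small deltas from the large ones.

    Every split of the sorted deltas into a low and a high group is tried,
    and the split whose group means are farthest apart is kept.
    """
    ordered = sorted(deltas)
    if len(ordered) < 2:
        raise ValueError("need at least two deltas")
    total = sum(ordered)
    best_gap, best_k = -1.0, 1
    low_sum = 0.0
    for k in range(1, len(ordered)):
        # ordered[:k] is the low group, ordered[k:] is the high group.
        low_sum += ordered[k - 1]
        low_mean = low_sum / k
        high_mean = (total - low_sum) / (len(ordered) - k)
        if high_mean - low_mean > best_gap:
            best_gap, best_k = high_mean - low_mean, k
    # Put the threshold halfway between the two groups.
    return (ordered[best_k - 1] + ordered[best_k]) / 2


deltas = [5, 1434, 1436, 1440, 1440, 1440, 1440, 1440, 1442]
print(split_threshold(deltas))  # 719.5
```

For these nine deltas the best split puts 5 alone in the low group, so any value between 5 and 1434 would serve equally well as the threshold.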

Divide the points into columns:

 -139679|-138239|-136799|-135359|-133925 -133920|-132480|-131044|-129602|-128162
 1440 1440 1440 1434 5 1440 1436 1442 1440

Almost every column contains a single point, except for the one that holds both -133925 and -133920.

Mean column positions:

 -139679 -138239 -136799 -135359 -133922.5 -132480 -131044 -129602 -128162 

Sorted deltas between columns:

 1436.0 1436.5 1440.0 1440.0 1440.0 1440.0 1442.0 1442.5 

Median:

 1440 

Which is the correct answer for your SMALL dataset, IMHO.
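The worked example can be reproduced with a short Python sketch of steps 1 and 3 to 6. The function name `grid_pitch` is hypothetical, and the threshold from step 2 is simply passed in as an argument; the value 700 used below is an assumption that happens to fall between the small and large deltas of this sample.

```python
from statistics import mean, median


def grid_pitch(values, threshold):
    """Estimate the grid spacing along one axis.

    `threshold` is assumed to separate within-column deltas (below it)
    from between-column deltas (at or above it).
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    # Start a new column whenever the gap to the previous value is large.
    columns = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev >= threshold:
            columns.append([cur])
        else:
            columns[-1].append(cur)
    # One mean position per column, then the deltas between neighbours.
    centers = [mean(col) for col in columns]
    spacings = [b - a for a, b in zip(centers, centers[1:])]
    if not spacings:
        raise ValueError("need at least two columns")
    # The median keeps a few broken columns from skewing the estimate.
    return median(spacings)


xs = [-133920, -132480, -131044, -129602, -128162,
      -139679, -138239, -136799, -135359, -133925]
print(grid_pitch(xs, threshold=700))  # 1440.0
```

The same function can presumably be run on the Y values, with its own threshold, to obtain the other tile dimension.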


Source: https://habr.com/ru/post/978942/

