Contours for the densest regions of a scatter plot

I am making a scatter plot of ~300k data points, and in some places it is so crowded that no structure is visible. So I had a thought!

I want the plot to draw contours for the densest parts and leave the less dense areas as scatter() data points.

So I tried to calculate the nearest-neighbor distance for each data point individually: where this distance falls below a certain value, draw a contour and fill it, and where it is much larger (less dense), just draw the scatter points...
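The nearest-neighbor idea described above could be sketched roughly as follows. This is only an illustration under assumptions: the data is a synthetic stand-in for the real [X,Y] columns, the `threshold` value is hypothetical, and `scipy.spatial.cKDTree` is used so that the search stays cheap even for a few hundred thousand points.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
# Synthetic stand-in for the real [X, Y] data: a tight clump plus a broad background.
points = np.vstack([
    rng.normal(0.0, 0.1, size=(5000, 2)),
    rng.uniform(-3.0, 3.0, size=(500, 2)),
])

tree = cKDTree(points)
# k=2 because the closest match to every point is the point itself (distance 0).
dist, _ = tree.query(points, k=2)
nn_dist = dist[:, 1]

threshold = 0.05  # hypothetical cut-off; it would need tuning for real data
dense = nn_dist < threshold
print(f"{dense.sum()} dense points, {(~dense).sum()} sparse points")
```

The boolean array `dense` can then decide which points go into a contour and which are drawn individually.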

I have been trying and failing for several days, and I'm not sure a conditional contour plot will work in this case.

I would post the code, but it is so messy that it would probably just confuse the issue. It is also so computationally intensive that it might just crash my computer even if it did work!

Thank you all in advance!

P.S. I have searched and searched for an answer! Judging by the results I found, I am almost convinced that this is not possible.

Edit: So the idea is to see where some specific points lie within the structure of the 300k sample. Here is an example plot, with my points shown in three different colors: my scatter version of the data

I will try to randomly sample 1000 points from my data and upload them as a text file. Cheers, Stackers. :)

Edit: Here is a sample of 1000 rows of the data, just two columns [X,Y] (or [gi,i] from the plot above), separated by spaces. Thank you all! data

2 answers

Four years later, I can finally answer this! It can be done using contains_points from matplotlib.path.

I used Gaussian smoothing from astropy, which can be omitted or replaced as needed.

    import numpy as np
    from matplotlib import pyplot as plt

    try:
        from astropy.convolution import Gaussian2DKernel, convolve
        astro_smooth = True
    except ImportError:
        astro_smooth = False

    np.random.seed(123)
    t = np.linspace(-1, 1.2, 2000)
    x = t**2 + 0.3 * np.random.randn(2000)
    y = t**5 + 0.5 * np.random.randn(2000)

    H, xedges, yedges = np.histogram2d(x, y, bins=(50, 40))
    xmesh, ymesh = np.meshgrid(xedges[:-1], yedges[:-1])

    # Smooth the contours (if astropy is installed)
    if astro_smooth:
        kernel = Gaussian2DKernel(1.0)
        H = convolve(H, kernel)

    fig, ax = plt.subplots(1, figsize=(7, 6))
    clevels = ax.contour(xmesh, ymesh, H.T, linewidths=0.9, cmap='winter')

    # Identify points within the lowest contour level
    if hasattr(clevels, 'get_paths'):
        paths = [clevels.get_paths()[0]]            # Matplotlib >= 3.8
    else:
        paths = clevels.collections[0].get_paths()  # older Matplotlib
    points = np.column_stack((x, y))
    inside = np.zeros(len(x), dtype=bool)
    for p in paths:
        inside |= p.contains_points(points)

    # Plot only the points that fall outside the contours
    ax.plot(x[~inside], y[~inside], 'kx')
    plt.show()
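If astropy is not installed, one possible replacement for the smoothing step is `scipy.ndimage.gaussian_filter`. The sketch below is an assumption about what such a replacement could look like, not part of the original answer. It rebuilds a similar histogram and smooths it with a sigma of one bin. The result will not be numerically identical to the astropy version, since the two libraries truncate the kernel differently.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(123)
t = np.linspace(-1, 1.2, 2000)
x = t**2 + 0.3 * rng.standard_normal(2000)
y = t**5 + 0.5 * rng.standard_normal(2000)

H, xedges, yedges = np.histogram2d(x, y, bins=(50, 40))

# sigma is measured in bins; mode="constant" pads the histogram with zeros at the edges.
H_smooth = gaussian_filter(H, sigma=1.0, mode="constant")
print(H.max(), H_smooth.max())
```

`H_smooth.T` could then be passed to `ax.contour` in place of `H.T`.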


You can achieve this with a combination of numpy / scipy / matplotlib tools:

  • Build a scipy.spatial.KDTree of the source points for fast lookups.
  • Use np.meshgrid to create a grid of points at the resolution required for the contour.
  • Use KDTree.query to build a mask of all grid locations that are within the target density.
  • Bin the data, either with a rectangular histogram or with plt.hexbin.
  • Draw contours from the binned data, but use the mask from the third step to filter out regions of lower density.
  • Use the inverse of the mask to plt.scatter the rest of the points.
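The steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the data is synthetic, the thresholds `k` and `radius` are hypothetical, the bin centers of `np.histogram2d` double as the grid, and the figure is written to an arbitrary file name.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
# Synthetic stand-in for the [X, Y] data: a dense core plus a sparse halo.
pts = np.vstack([
    rng.normal(0.0, 0.3, size=(20000, 2)),
    rng.normal(0.0, 1.5, size=(2000, 2)),
])
x, y = pts[:, 0], pts[:, 1]

# KD-tree over the source points for fast neighbor lookups.
tree = cKDTree(pts)

# Rectangular binning; the bin centers are reused as the contour grid.
H, xedges, yedges = np.histogram2d(x, y, bins=(60, 60))
xc = 0.5 * (xedges[:-1] + xedges[1:])
yc = 0.5 * (yedges[:-1] + yedges[1:])
gx, gy = np.meshgrid(xc, yc)  # shape (ny, nx), matching H.T

# A grid node counts as dense when its k-th nearest point lies within `radius`.
k, radius = 30, 0.1  # hypothetical thresholds
dist, _ = tree.query(np.column_stack([gx.ravel(), gy.ravel()]), k=k)
dense = (dist[:, -1] < radius).reshape(gx.shape)

# Contour only the dense cells by masking everything else.
fig, ax = plt.subplots(figsize=(7, 6))
ax.contourf(gx, gy, np.ma.masked_where(~dense, H.T), levels=8, cmap="winter")

# Look up each point's cell and scatter the points that fall in non-dense cells.
ix = np.clip(np.searchsorted(xedges, x, side="right") - 1, 0, len(xc) - 1)
iy = np.clip(np.searchsorted(yedges, y, side="right") - 1, 0, len(yc) - 1)
sparse = ~dense[iy, ix]
ax.scatter(x[sparse], y[sparse], s=2, color="k")
fig.savefig("density_contours.png")
```

Using the distance to the k-th neighbor as the density test keeps the mask independent of the bin size, so the contour resolution and the density threshold can be tuned separately.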

Source: https://habr.com/ru/post/955746/

