How numpy indexing works in this scenario

How does numpy logical indexing get data from the "data" variable in the code snippet below? I understand that the first parameter is the x coordinate and the second parameter is the y coordinate, but I'm not sure how the index expressions map to data points in the variable.

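    from pylab import plot
    from numpy import vstack, array
    from numpy.random import rand
    from scipy.cluster.vq import vq

    data = vstack((rand(150,2) + array([.5,.5]), rand(150,2)))

    # assign each sample to a cluster
    # (centroids comes from an earlier step that is not shown in this snippet)
    idx,_ = vq(data,centroids)

    # some plotting using numpy logical indexing
    plot(data[idx==0,0],data[idx==0,1],'ob',
         data[idx==1,0],data[idx==1,1],'or')
    plot(centroids[:,0],centroids[:,1],'sg',markersize=8)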
1 answer

It's all in the shapes:

    In [89]: data.shape
    Out[89]: (300, 2)   # data has 300 rows and 2 columns

    In [93]: idx.shape
    Out[93]: (300,)     # idx is a 1D-array with 300 elements

idx == 0 is a boolean array with the same shape as idx. It is True wherever the element in idx is 0:

    In [97]: (idx==0).shape
    Out[97]: (300,)
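As a small illustration of how the comparison builds a mask, here is a minimal sketch. The six-element idx array is made up for the example and only stands in for the labels that vq returns.

```python
import numpy as np

# Hypothetical cluster labels for six samples (stand-in for the idx from vq)
idx = np.array([0, 1, 0, 0, 1, 0])

# Elementwise comparison: one boolean per element of idx
mask = idx == 0

print(mask.tolist())   # [True, False, True, True, False, True]
print(mask.shape)      # (6,)  same shape as idx
print(mask.sum())      # 4     number of samples assigned to cluster 0
```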

When you index data with idx==0, you get all rows of data where idx==0 is True:

    In [98]: data[idx==0].shape
    Out[98]: (178, 2)
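The same row selection can be tried on a tiny array. This is only a sketch: the four points and their labels are invented for the example, not taken from the question's data.

```python
import numpy as np

# Hypothetical data: four (x, y) points, one per row
data = np.array([[0.1, 0.2],
                 [0.9, 1.1],
                 [0.3, 0.4],
                 [1.2, 0.8]])
# Hypothetical cluster label for each row
idx = np.array([0, 1, 0, 1])

# A boolean array on the first axis keeps the rows where the mask is True
rows = data[idx == 0]

print(rows.shape)      # (2, 2)  two rows kept, both columns kept
print(rows.tolist())   # [[0.1, 0.2], [0.3, 0.4]]
```

The number of rows in the result equals the number of True entries in the mask, which is why the answer's output has 178 rows rather than 300.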

When you index with the tuple data[idx==0, 0], the first axis of data is indexed with the boolean array idx==0, and the second axis is indexed with 0:

    In [99]: data[idx==0, 0].shape
    Out[99]: (178,)

The first axis of data corresponds to rows, the second to columns. So you get only the first column of data[idx==0]. Since the first column of data holds the x values, this gives those x values in data where idx==0.
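To tie this back to the plot call in the question, here is a minimal sketch with made-up points and labels. It shows that the tuple index gives the same result as selecting the rows first and then taking a column, and that column 1 gives the matching y values.

```python
import numpy as np

# Hypothetical data: four (x, y) points and a cluster label for each row
data = np.array([[0.1, 0.2],
                 [0.9, 1.1],
                 [0.3, 0.4],
                 [1.2, 0.8]])
idx = np.array([0, 1, 0, 1])

x0 = data[idx == 0, 0]   # x values of the points in cluster 0
y0 = data[idx == 0, 1]   # y values of the same points

print(x0.tolist())       # [0.1, 0.3]
print(y0.tolist())       # [0.2, 0.4]

# Same result in two steps: pick the rows, then take column 0
print(np.array_equal(x0, data[idx == 0][:, 0]))   # True
```

So plot(data[idx==0,0], data[idx==0,1], 'ob') receives the x and y coordinates of exactly the points labelled 0, and the idx==1 pair does the same for the other cluster.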


Source: https://habr.com/ru/post/1493867/

