Select 5 data points closest to the SVM hyperplane

I wrote Python code using Sklearn to cluster my dataset:

from sklearn.cluster import AffinityPropagation

af = AffinityPropagation().fit(X)
cluster_centers_indices = af.cluster_centers_indices_
labels = af.labels_
n_clusters_ = len(cluster_centers_indices)

I am exploring clustering-based querying and, as a first step, build an initial training set:

# assumption: centers holds the exemplar indices found above
centers = cluster_centers_indices

td_title = []
td_abstract = []
td_y = []
for each in centers:
    td_title.append(title[each])
    td_abstract.append(abstract[each])
    td_y.append(y[each])

Then I train my model (SVM) on it:

from sklearn import svm

clf = svm.SVC()
clf.fit(X, data_y)

I want to write a function that, given the centers, the model, the X values and the Y values, appends the 5 data points the model is most unsure about, i.e. the data points closest to the hyperplane. How can I do this?

1 answer

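Here is a generic way to "get the (5) points closest to the SVM hyperplane". In scikit-learn, decision_function returns a signed score for each sample that is proportional to its distance from the separating hyperplane. Take the absolute value of those scores and use argsort to pick the "top/bottom N" samples.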

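Following a scikit-learn example, I defined a function, closestN, that returns the indices of the samples closest to the hyperplane.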

import numpy as np

def closestN(X_array, n):
    # array of sample distances to the hyperplane
    dists = clf.decision_function(X_array)
    # absolute distance to hyperplane
    absdists = np.abs(dists)

    return absdists.argsort()[:n]

Adapting the scikit-learn example slightly to plot those points:

import matplotlib.pyplot as plt

closest_samples = closestN(X, 5)
# highlight the 5 samples nearest the hyperplane
plt.scatter(X[closest_samples][:, 0], X[closest_samples][:, 1], color='yellow')
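
A note on what "distance" means in closestN: the values from decision_function are scores, not lengths in the units of X. As a minimal sketch, assuming a binary problem and a linear kernel (the only case where coef_ is available), dividing by the norm of the weight vector gives the geometric distance. The ranking is the same either way, which is why closestN can sort the raw scores. The data here comes from make_blobs and the names X_demo, y_demo and linear_clf are made up for illustration.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_blobs

# Synthetic two-class data (an assumption; not the asker's dataset).
X_demo, y_demo = make_blobs(n_samples=40, centers=2, random_state=0)

# A linear kernel exposes coef_, so the geometric distance can be computed.
linear_clf = svm.SVC(kernel="linear").fit(X_demo, y_demo)

# Signed scores, one per sample.
scores = linear_clf.decision_function(X_demo)
# Geometric distance to the hyperplane: |score| / ||w||.
distances = np.abs(scores) / np.linalg.norm(linear_clf.coef_)

# Dividing by a positive constant does not change the ordering.
closest_by_score = np.argsort(np.abs(scores))[:5]
closest_by_distance = np.argsort(distances)[:5]
```

With a non-linear kernel such as the default RBF there is no coef_, but sorting by the absolute score still ranks samples by closeness to the decision boundary in the kernel's feature space.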

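If you need to add the indices of those samples to some list, you can use somelist.append(closestN(X, 5)). If you need the sample values instead, use something like somelist.append(X[closestN(X, 5)]).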

closestN(X, 5)
array([ 1, 20, 14, 31, 24])

X[closestN(X, 5)]
array([[-1.02126202,  0.2408932 ],
       [ 0.95144703,  0.57998206],
       [-0.46722079, -0.53064123],
       [ 1.18685372,  0.2737174 ],
       [ 0.38610215,  1.78725972]])
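
For the function the question asks for, one possible sketch is shown here. It is not part of the original answer and makes several assumptions: the problem is binary (so decision_function returns a 1-D array), the training set is tracked as an array of row indices into X, and samples already in the training set should not be picked again. The names add_most_uncertain, labelled_idx, X_demo and y_demo are hypothetical, and the data is synthetic.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import make_blobs


def add_most_uncertain(model, X, labelled_idx, n=5):
    """Return labelled_idx extended by the n unlabelled samples nearest the hyperplane.

    Assumes a binary classifier, so decision_function returns a 1-D array.
    """
    labelled_idx = np.asarray(labelled_idx)
    # Candidate pool: every sample index not already in the training set.
    pool = np.setdiff1d(np.arange(X.shape[0]), labelled_idx)
    # A smaller absolute score means the sample is closer to the boundary.
    uncertainty = np.abs(model.decision_function(X[pool]))
    chosen = pool[np.argsort(uncertainty)[:n]]
    return np.concatenate([labelled_idx, chosen])


# Synthetic stand-in for the asker's data (an assumption).
X_demo, y_demo = make_blobs(n_samples=60, centers=2, random_state=1)

# Hypothetical initial training indices: four samples from each class,
# standing in for the cluster exemplars.
initial_idx = np.concatenate(
    [np.flatnonzero(y_demo == 0)[:4], np.flatnonzero(y_demo == 1)[:4]]
)

model = svm.SVC().fit(X_demo[initial_idx], y_demo[initial_idx])
updated_idx = add_most_uncertain(model, X_demo, initial_idx, n=5)
```

After this step the model can be refit on X_demo[updated_idx] and y_demo[updated_idx], and the call repeated to grow the training set five points at a time.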

Source: https://habr.com/ru/post/1665606/

