Using prepared data for scikit-learn classification

I am trying to use the scikit-learn Python library to classify a set of URLs according to whether they contain keywords matching a user's profile. A user has a name, an email address ... and an address assigned to them. I created a txt file with the result of each profile-data match for each link, so it is in the format:

Name  Email  Address
  0     1      0      => Relevant
  1     1      0      => Relevant
  0     1      1      => Relevant
  0     0      0      => Not Relevant

Here a value of 1 or 0 indicates whether or not the attribute was found on the page (each line is one web page). How do I pass this data to scikit-learn so that it can be used to train a classifier? In the examples I have seen, the data either comes from a predefined scikit-learn dataset, such as digits or iris, or is generated already in the format the library expects. I just don't know how to get the data I have into the format the library needs.

The above is a toy example; I have many more features than 3.

1 answer

The data you need is a NumPy array (in this case a "matrix") with the shape (n_samples, n_features).
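As a minimal sketch of that shape, the table from the question can be written out by hand as arrays. The names X and y and the 1/0 encoding of "Relevant"/"Not Relevant" are assumptions made for illustration:

import numpy as np

# Rows are web pages; columns are the Name, Email, Address match flags
# (values copied from the table in the question).
X = np.array([
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
])

# One label per row. Encoding Relevant as 1 and Not Relevant as 0
# is an assumption, not something the library requires.
y = np.array([1, 1, 1, 0])

n_samples, n_features = X.shape
print(X.shape)  # (4, 3)
print(y.shape)  # (4,)

With more attributes per page, only the number of columns in X changes; y keeps one entry per row.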

A simple way to read a CSV file into the right format is numpy.genfromtxt.

Let the contents of the CSV file (say, file.csv) be:

a,b,c,target
1,1,1,0
1,0,1,0
1,1,0,1
0,0,1,1
0,1,1,0

To load it, call

import numpy as np
data = np.genfromtxt('file.csv', delimiter=',', skip_header=True)  # comma-separated; skip the header row

skip_header is set to True so that the header line (a,b,c,target) is not read as data. See the NumPy documentation for more details.
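The loading step can be tried without creating a file. This is a sketch in which io.StringIO stands in for file.csv. Note that comma-separated input needs delimiter=",", because genfromtxt splits on whitespace by default:

import io
import numpy as np

# Stand-in for file.csv so the sketch runs without touching the disk.
csv_text = """a,b,c,target
1,1,1,0
1,0,1,0
1,1,0,1
0,0,1,1
0,1,1,0
"""

# delimiter="," splits each line on commas;
# skip_header=1 skips the "a,b,c,target" line.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(data.shape)  # (5, 4)
print(data.dtype)  # float64

The values come back as floats; cast with astype(int) if integer labels are preferred.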

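Once the data is loaded, it needs to be pre-processed to match your input format. Pre-processing could mean separating the inputs from the targets (the classification labels) or splitting the whole dataset into training and validation sets (for cross-validation).

To separate the input (feature matrix) from the output (target vector),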

features = data[:, :3]
targets = data[:, 3]   # The last column is identified as the target

For the data given in the question, stored as CSV, the result would look like this:

features = array([[0, 1, 0],
                  [1, 1, 0],
                  [0, 1, 1],
                  [0, 0, 0]])  # shape = (4, 3)

targets = array([1, 1, 1, 0])  # shape = (4,)
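Since the real dataset has many more than three features, the slice does not have to hard-code the column count. This sketch assumes the target is always the last column. The six-column array is made-up illustration data, not taken from the question:

import numpy as np

# Hypothetical loaded data: 4 pages, 5 feature columns, label in the last column.
data = np.array([
    [0, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
], dtype=float)

features = data[:, :-1]            # every column except the last
targets = data[:, -1].astype(int)  # last column, cast to integer class labels

print(features.shape)  # (4, 5)
print(targets)         # [1 1 1 0]

Negative indexing keeps the split correct no matter how many feature columns the file contains.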

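These arrays are then passed to the fit method of the estimator object. If you are using the popular SVM classifier, then: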

>>> from sklearn.svm import LinearSVC
>>> linear_svc_model = LinearSVC()
>>> linear_svc_model.fit(X=features, y=targets) 
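
Once fitted, the model can label pages it has not seen by calling predict. The sketch below repeats the fit on the question's four rows so that it runs on its own. The new_pages array is hypothetical, and with so little training data the predictions are only illustrative:

import numpy as np
from sklearn.svm import LinearSVC

# Training data from the question: Name, Email, Address match flags per page.
features = np.array([[0, 1, 0],
                     [1, 1, 0],
                     [0, 1, 1],
                     [0, 0, 0]])
targets = np.array([1, 1, 1, 0])  # 1 = relevant, 0 = not relevant (assumed encoding)

model = LinearSVC()
model.fit(features, targets)

# Hypothetical unseen pages, with the same three columns as the training data.
new_pages = np.array([[1, 0, 0],
                      [0, 0, 0]])
predictions = model.predict(new_pages)
print(predictions)  # one predicted label (0 or 1) per row of new_pages

The array passed to predict must have the same number of columns, in the same order, as the one used for fit.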

Source: https://habr.com/ru/post/1524801/

