Feature selection using Lasso with scikit-learn

I want to do some kind of feature selection using Python and the scikit-learn library.

As far as I know, Lasso regression can be used for feature selection, much like univariate selection.

My simple data set looks like this:

    G1   G2   G3   ...  GN   Class
    1.0  4.0  5.0  ...  1.0  X
    4.0  5.0  9.0  ...  1.0  X
    9.0  6.0  3.0  ...  2.0  Y
    ...

I want to use Lasso regression to find the top-N attributes (Gs) that most strongly affect the class. Although the parameters still need tuning, I think Lasso regression can be applied like this:

    from sklearn.linear_model import Lasso

    # A = list of feature rows [G1, G2, ..., GN], B = class labels [X, X, Y, ...]
    lasso = Lasso()
    lasso.fit(A, B)
    print(lasso.coef_)

Is it right to conclude that an attribute is more strongly associated with the class if it has a larger lasso.coef_ value? I would also like to know whether there is a rule for selecting the top-N genes with this regression. With the Pearson correlation coefficient (PCC), a p-value threshold such as 0.05 can be used for selection, but I don't know what the equivalent is for Lasso. Can someone give me an idea, please?
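To make the question concrete, here is a minimal sketch of one common way to rank features with Lasso. It rests on assumptions that are not in the post: the data is synthetic, the class labels X/Y are encoded as 1.0/0.0 because Lasso expects a numeric target, the features are standardized so that coefficient magnitudes are comparable, and alpha=0.05 is an arbitrary illustrative value (it could instead be chosen by cross-validation, for example with LassoCV). Features are ranked by the absolute value of the coefficient, not the signed value, since a large negative coefficient also indicates a strong association.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the G1..GN table (hypothetical data, not from the post).
rng = np.random.default_rng(0)
n_samples, n_features = 200, 10
A = rng.normal(size=(n_samples, n_features))

# Assumed ground truth for the demo: only columns 0 and 3 drive the class.
score = 3.0 * A[:, 0] - 2.0 * A[:, 3] + 0.1 * rng.normal(size=n_samples)
labels = np.where(score > 0, "X", "Y")

# Lasso is a regressor, so encode the two classes numerically (X -> 1.0, Y -> 0.0).
B = (labels == "X").astype(float)

# Standardize so coefficients are on a comparable scale across features.
A_scaled = StandardScaler().fit_transform(A)

# alpha controls sparsity: larger values push more coefficients to exactly zero.
lasso = Lasso(alpha=0.05)
lasso.fit(A_scaled, B)

# Rank by absolute coefficient and keep the top-N feature indices.
importance = np.abs(lasso.coef_)
top_n = 2
top_idx = np.argsort(importance)[::-1][:top_n]

# Alternative rule: keep every feature whose coefficient is non-zero.
selected = np.flatnonzero(importance > 0)

print("top-N feature indices:", top_idx)
print("non-zero feature indices:", selected)
```

Unlike a p-value cutoff, Lasso has no fixed significance threshold: the number of surviving features depends on alpha, so either alpha is tuned and all non-zero coefficients are kept, or the features are ranked by absolute coefficient and cut at N as above.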


Source: https://habr.com/ru/post/1237649/