PLS-DA algorithm in python

The partial least squares (PLS) algorithm is implemented in the scikit-learn library, as described here: http://scikit-learn.org/0.12/auto_examples/plot_pls.html In the case where y is a binary vector, a variant of this algorithm is used: the Partial Least Squares Discriminant Analysis (PLS-DA) algorithm. Does the PLSRegression module in sklearn.pls also implement this binary case? If not, where can I find a Python implementation of it? In my binary case, I'm trying to use PLSRegression:

 pls = PLSRegression(n_components=10)
 pls.fit(x, y)
 x_r, y_r = pls.transform(x, y, copy=True)

In the transform function, the code raises an exception on this line:

 y_scores = np.dot(Yc, self.y_rotations_) 

The error message is "ValueError: matrices are not aligned." Yc is the normalized y vector, and self.y_rotations_ = [1.]. In the fit function, self.y_rotations_ is set to np.ones(1) when the source y is a one-dimensional vector (y.shape[1] == 1).

+10
3 answers

PLS-DA is really a "trick" to use PLS for categorical outcomes instead of the usual continuous vector/matrix. The trick consists of creating a dummy indicator matrix of zeros/ones that represents membership in each of the categories. So if you have a binary outcome to predict (i.e. male/female, yes/no, etc.), your dummy matrix will have two columns representing membership in either category.

For example, consider the gender outcome for four people: two men and two women. The dummy matrix would be encoded as follows:

 import numpy as np
 dummy = np.array([[1, 1, 0, 0], [0, 0, 1, 1]]).T

where each column represents membership in one of the two categories (male, female).

Then your model for the data in the Xdata variable (shape: 4 rows, arbitrary columns) would be:

 myplsda=PLSRegression().fit(X=Xdata,Y=dummy) 

The predicted categories can be extracted by comparing the two indicator columns in mypred:

 mypred= myplsda.predict(Xdata) 

For each row/case, the predicted gender is the one with the highest predicted membership value.

+17

You can use the linear discriminant analysis package in sklearn; it will take integers for the y value:

LDA-SKLearn

Here's a quick guide to using LDA: Sklearn LDA Tutorial
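As a minimal sketch of that point (synthetic data; in current scikit-learn the class is LinearDiscriminantAnalysis in sklearn.discriminant_analysis), note that y here is just an integer label vector, with no dummy matrix needed:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic two-class data
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 1, (10, 3)),
               rng.normal(3, 1, (10, 3))])
y = np.array([0] * 10 + [1] * 10)   # integer class labels, taken directly

lda = LinearDiscriminantAnalysis().fit(X, y)
pred = lda.predict(X)
```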

+1

Not quite what you are looking for, but check these two threads: one on a C++ PLS library implementation, and one on how to call your own C/C++ code from Python:

Partial Least Squares Library

Calling C/C++ from Python?

You can use Boost.Python to expose C++ code to Python. Here is an example taken from the official site:

Following the C/C++ tradition, let's start with "hello, world". A C++ function:

 char const* greet()
 {
     return "hello, world";
 }

can be exposed to Python by writing a Boost.Python wrapper:

 #include <boost/python.hpp>

 BOOST_PYTHON_MODULE(hello_ext)
 {
     using namespace boost::python;
     def("greet", greet);
 }

That's it. We're done. We can now build this as a shared library. The resulting DLL is now visible to Python. Here is a sample Python session:

 >>> import hello_ext
 >>> print hello_ext.greet()
 hello, world
0

Source: https://habr.com/ru/post/952310/

