Randomized SVD for LSA / LSI on Windows

I am working on a project that uses latent semantic analysis (LSA). This requires singular value decomposition (SVD), sometimes on large data sets. Is there an implementation of randomized SVD (rSVD) available for Windows / Visual Studio? I have seen the redsvd project, but it seems to support only Linux.

2 answers

ILNumerics may have this. I have not checked whether it offers rSVD, and I have no personal experience with the library, but it is conveniently available through NuGet.

http://ilnumerics.net

Here is the documentation for their SVD implementation:

http://ilnumerics.net/apidoc/Index.html?topic=html/Overload_ILNumerics_ILMath_svd.htm

There is also NAG, but it is a paid product: http://www.nag.co.uk/numeric/numerical_libraries.asp

I also looked at redsvd, and I bet I could either port it to C# for you or at least get it to compile on Windows. If the options above do not meet your requirements, let me know and I will look into how hard the port would be.

UPDATE:

I got home tonight and decided to give it a shot. Here is a very quick way to get redsvd working on Windows with Visual Studio 2010. I posted it on GitHub:

https://github.com/hoonto/redsvdwin

Open rsvd3.sln in Visual Studio, build it, and you will get rsvd3.exe in the Debug directory.

Run this:

    C:\Users\MLM\Documents\Visual Studio 2010\Projects\redsvdwin\Debug>rsvd3.exe
    usage: redsvd --input=string --output=string [options] ...

    redsvd supports the following format types (one line for each row)

    [format=dense] (<value>+\n)+
    [format=sparse] ((colum_id:value)+\n)+
    Example:
    >redsvd -i imat -o omat -r 10 -f dense
    compuate SVD for a dense matrix in imat and output omat.U omat.V, and omat.S
    with the 10 largest eigen values/vectors
    >redsvd -i imat -o omat -r 3 -f sparse -m PCA
    compuate PCA for a sparse matrix in imat and output omat.PC omat.SCORE
    with the 3 largest principal components

    options:
      -i, --input     input file (string)
      -o, --output    output file prefix (string)
      -r, --rank      rank (int [=10])
      -f, --format    format type (dense|sparse) See example. (string [=dense])
      -m, --method    method (SVD|PCA|SymEigen) (string [=SVD])
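To try the tool, you need an input file in one of the two formats listed in the usage text. Here is a minimal Python sketch that writes a small matrix both ways. It assumes that values on a line are separated by single spaces and that column ids in the sparse format start at 0; the usage text does not state either, so check against the redsvd sources. The matrix and the file names (imat_dense.txt, imat_sparse.txt) are made up for illustration.

```python
# Sketch: serialize a small matrix in the dense and sparse text formats
# described by the redsvd usage message (one line per matrix row).
# Assumptions (not confirmed by the usage text): space-separated tokens,
# zero-based column ids in the sparse format.


def format_dense(rows):
    """Dense format: every value of the row, separated by spaces."""
    return "".join(" ".join(repr(float(v)) for v in row) + "\n" for row in rows)


def format_sparse(rows):
    """Sparse format: only nonzero entries, written as column_id:value."""
    lines = []
    for row in rows:
        pairs = [f"{j}:{float(v)!r}" for j, v in enumerate(row) if v != 0]
        lines.append(" ".join(pairs) + "\n")
    return "".join(lines)


# Hypothetical 3x4 term-count matrix (rows = documents, columns = terms).
counts = [
    [2, 0, 1, 0],
    [0, 3, 0, 0],
    [1, 0, 0, 4],
]

dense_text = format_dense(counts)
sparse_text = format_sparse(counts)

if __name__ == "__main__":
    # File names are placeholders; pass them to the tool with -i.
    with open("imat_dense.txt", "w") as f:
        f.write(dense_text)
    with open("imat_sparse.txt", "w") as f:
        f.write(sparse_text)
```

The resulting files could then be passed with -i, together with -f dense or -f sparse, as in the examples in the usage text. Note that an all-zero row becomes an empty line in the sparse file; how the tool treats such a line is not covered here.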

And there you have it. By the way, this builds redsvdMain.cpp. If you want the Incr version with its own main, exclude redsvdMain.cpp and include redsvdMainIncr.cpp instead. Since both files define a main, I simply excluded the Incr version and built the regular one.

In addition, I included the Eigen3 headers in the GitHub repository and added them to the additional include directories in the solution configuration, so you don’t have to bother with this at all.

Finally, as far as I know, cxxabi.h is not available in Visual Studio, so I cheated a little. You will see where I made the changes, because they are commented like this:

    //MLM: commented next 3
    //...
    //...
    //...
    //MLM: added 1
    ...

and so on. So if you need to make adjustments, you will know where my changes are.


The qr function in ILNumerics has an overload, ILMath.qr(A, outR, outE, economySize), which allows the decomposition to be computed in economy size.
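Presumably the economy-size QR matters here because it is the orthonormalization step in the first stage of a randomized SVD: sample the range of A with a random matrix, then orthonormalize the samples. Here is a minimal pure-Python sketch of that stage only. It does not use ILNumerics or redsvd; modified Gram-Schmidt stands in for a library QR, and all function names and the test matrix are made up for illustration.

```python
# Sketch of the range-finding stage of a randomized SVD.
# Not the ILNumerics or redsvd implementation; all names are illustrative.
import random


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def economy_q(Y):
    """Orthonormal basis for the columns of Y (m x k), returned as m x k.

    Modified Gram-Schmidt stands in for an economy-size QR; only Q is kept.
    """
    basis = []
    for v in transpose(Y):
        for u in basis:
            dot = sum(a * b for a, b in zip(u, v))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-12:  # skip numerically dependent columns
            basis.append([a / norm for a in v])
    return transpose(basis)


def range_finder(A, k, seed=0):
    """Return Q with k orthonormal columns approximating the range of A."""
    rng = random.Random(seed)
    n = len(A[0])
    omega = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(n)]
    return economy_q(matmul(A, omega))  # Y = A * Omega, then orthonormalize


# Hypothetical 6x5 matrix of exact rank 2, built from two outer products.
u1, v1 = [1, 2, 3, 4, 5, 6], [1, 0, 1, 0, 1]
u2, v2 = [1, -1, 1, -1, 1, -1], [0, 2, 0, 1, 0]
A = [[u1[i] * v1[j] + u2[i] * v2[j] for j in range(5)] for i in range(6)]

Q = range_finder(A, k=2)
B = matmul(transpose(Q), A)  # small 2x5 matrix
approx = matmul(Q, B)        # Q * Q^T * A
error = max(abs(A[i][j] - approx[i][j]) for i in range(6) for j in range(5))
```

Because the example matrix has exact rank 2, projecting onto the two columns of Q should reproduce A up to rounding. A full randomized SVD would go on to compute an ordinary SVD of the small matrix B and multiply its left singular vectors by Q; that step is omitted here.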


Source: https://habr.com/ru/post/946886/