Fast approximation methods for the 3 largest eigenvalues and eigenvectors of a large symmetric matrix

I am writing code to compute classical multidimensional scaling (MDS for short) for very large n by n matrices; n = 500,000 in my example.

In one step of MDS, I need to compute the three largest eigenvalues and the corresponding eigenvectors of an n by n matrix. This matrix is called the B matrix. I only need these three eigenvectors and eigenvalues. General methods for computing the eigenvectors and eigenvalues of a large matrix take a long time, and I do not need a very accurate answer, so I am looking for a way to estimate the eigenvectors and eigenvalues.

Some notes:

  • The matrix B is symmetric, real, and fairly dense.
  • The eigendecomposition of B should, in theory, always give real numbers.
  • I do not need a very accurate estimate, just a quick one. I need it to finish within a few hours.
  • I am writing in Python and C++.

My question is: are there fast methods for estimating the three largest eigenvalues and their eigenvectors of such a large matrix B?

My progress so far: I found a method for approximating the largest eigenvalue of a matrix, but I do not know whether it can be generalized to the three largest. I also found this paper written in 1996, but it is extremely technical and hard for me to follow.

+6
3 answers

Chapter 9 of Matrix Computations (2nd edition) by G. Golub and C. F. Van Loan says that the Lanczos algorithm is one option for this. The matrix should ideally be sparse, but the method obviously works for non-sparse matrices too.

https://en.wikipedia.org/wiki/Lanczos_algorithm
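To make the Lanczos suggestion concrete, here is a minimal NumPy sketch, not a production implementation. It assumes B is only available through a matrix-vector product (the hypothetical `matvec` argument), runs a fixed number of steps `m` without restarting, and uses full reorthogonalization for numerical stability. The function name and the default `m=40` are illustrative choices, not something from the book.

```python
import numpy as np

def lanczos_top_k(matvec, n, k=3, m=40, seed=0):
    """Approximate the k largest eigenpairs of a symmetric n x n operator.

    matvec(v) must return B @ v; m is the number of Lanczos steps (m >= k).
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, m))   # Lanczos vectors, one per column
    alpha = np.zeros(m)    # diagonal of the tridiagonal matrix T
    beta = np.zeros(m)     # beta[j] couples Q[:, j] and Q[:, j + 1]
    q = rng.standard_normal(n)
    Q[:, 0] = q / np.linalg.norm(q)
    steps = m
    for j in range(m):
        w = matvec(Q[:, j])
        alpha[j] = Q[:, j] @ w
        # Full reorthogonalization against all vectors so far (two passes).
        for _ in range(2):
            w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        b = np.linalg.norm(w)
        if j + 1 == m or b < 1e-12:   # done, or Krylov space exhausted
            steps = j + 1
            break
        beta[j] = b
        Q[:, j + 1] = w / b
    T = (np.diag(alpha[:steps])
         + np.diag(beta[:steps - 1], 1)
         + np.diag(beta[:steps - 1], -1))
    theta, S = np.linalg.eigh(T)           # small dense eigenproblem
    idx = np.argsort(theta)[::-1][:k]      # k largest Ritz values
    return theta[idx], Q[:, :steps] @ S[:, idx]
```

Each step costs one product with B plus O(n·m) extra work, and the stored basis is n by m, which for n = 500,000 and m = 40 is about 160 MB in double precision. How many steps are enough depends on how well separated the top three eigenvalues are from the rest.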

+8

You can get the top eigenvector of B and then transform the data into B' using that eigenvector. Then drop the first column of B' to get B'', and compute the top eigenvector of B'': that is enough information to reconstruct a plausible second eigenvector of B. Then repeat for the third.
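One way to sketch this "find the top one, remove it, repeat" idea in Python is power iteration combined with Hotelling deflation. Note that this is a related variant, not the exact transform-and-drop-a-column procedure described above: instead of shrinking the matrix, it subtracts each found component λ·u·uᵀ from the operator. The function names are illustrative, and the sketch assumes the three wanted eigenvalues are also the largest in absolute value, since power iteration converges to the dominant one.

```python
import numpy as np

def power_iteration(matvec, n, iters=500, tol=1e-12, seed=0):
    """Dominant eigenpair of a symmetric operator given as v -> B @ v."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        w = matvec(v)
        lam_new = v @ w                  # Rayleigh quotient estimate
        norm = np.linalg.norm(w)
        if norm == 0.0:
            break
        v = w / norm
        converged = abs(lam_new - lam) <= tol * max(1.0, abs(lam_new))
        lam = lam_new
        if converged:
            break
    return lam, v

def top_k_by_deflation(B, k=3):
    """Top-k eigenpairs via repeated power iteration on a deflated operator."""
    n = B.shape[0]
    found = []                           # list of (eigenvalue, eigenvector)

    def deflated(v):
        w = B @ v
        for lam, u in found:             # remove components already found
            w -= lam * (u @ v) * u
        return w

    for i in range(k):
        found.append(power_iteration(deflated, n, seed=i))
    vals = np.array([lam for lam, _ in found])
    vecs = np.column_stack([u for _, u in found])
    return vals, vecs
```

B itself is never modified, so the cost stays at one product with B per iteration. Convergence slows down when neighbouring eigenvalues are close together, which is the main reason Lanczos-type methods are usually preferred.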

About speed: you can randomly sample this huge data set down to a data set of N items. Since you only want three dimensions, I hope you can also discard most of the data and still get an overview of the eigenvectors. You could call it "polling". I cannot help you measure the error rate, but I would try sampling 1k items several times and see whether the results come out more or less the same.

Now you can average several polls to build a "prediction".
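A rough sketch of the polling idea, under several assumptions that are mine rather than the answer's: the input is a distance matrix `D` (in practice only the sampled block would ever be computed), B for each sample is built by the usual classical MDS double centering, and eigenvalues are divided by the sample size so that polls can be compared and rescaled to the full n. Only eigenvalues are averaged here, because eigenvectors from different samples refer to different items and cannot be averaged directly.

```python
import numpy as np

def mds_poll(D, m, rng):
    """One 'poll': top-3 classical MDS eigenvalues on m random items."""
    idx = rng.choice(D.shape[0], size=m, replace=False)
    D2 = D[np.ix_(idx, idx)] ** 2            # squared distances in the sample
    J = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    B_sub = -0.5 * J @ D2 @ J                # double-centered matrix B
    vals = np.linalg.eigvalsh(B_sub)[::-1][:3]
    return vals / m                          # per-item scale (assumption)

def poll_eigenvalues(D, m=1000, polls=5, seed=0):
    """Average several polls; returns (estimate, spread) rescaled to n items."""
    rng = np.random.default_rng(seed)
    est = np.array([mds_poll(D, m, rng) for _ in range(polls)])
    n = D.shape[0]
    return n * est.mean(axis=0), n * est.std(axis=0)
```

The spread across polls gives a crude feel for how stable the estimate is: if it is small relative to the estimate, the sample size is probably adequate for an overview.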

+2

Take a look at the suggestions in this thread.

Largest eigenvalues (and corresponding eigenvectors) in C++

As suggested there, you can use the ARPACK package, which has a C++ interface.
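Since the question also mentions Python: SciPy's `scipy.sparse.linalg.eigsh` is a wrapper around ARPACK, so the same solver can be tried from Python before committing to the C++ interface. The sketch below assumes B is available only as a matrix-vector product; the wrapper name `top3_arpack` and the loose tolerance `tol=1e-3` are illustrative choices for an approximate answer.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def top3_arpack(matvec, n, tol=1e-3):
    """Three largest (algebraic) eigenpairs of a symmetric operator via ARPACK."""
    op = LinearOperator((n, n), matvec=matvec, dtype=np.float64)
    # which="LA" asks for the largest algebraic eigenvalues;
    # a loose tol trades accuracy for fewer products with B.
    vals, vecs = eigsh(op, k=3, which="LA", tol=tol)
    order = np.argsort(vals)[::-1]           # sort in descending order
    return vals[order], vecs[:, order]
```

For a dense B that fits in memory, `eigsh(B, k=3, which="LA")` can be called on the array directly. The `LinearOperator` form is useful when B is too large to store and has to be applied on the fly.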

0

Source: https://habr.com/ru/post/1012656/