You can compute the top eigenvector of B (the one with the largest eigenvalue) and then transform the data into B' using that eigenvector. Then drop the first column of B' to get B'', and compute the top eigenvector of B''. That gives you enough information to build a plausible second eigenvector for B. Repeat the same step for the third.
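Here is a minimal sketch of that idea in Python with NumPy, under a few assumptions of mine: B is a data matrix with one sample per row, "eigenvector of B" means an eigenvector of its covariance matrix, the top eigenvector is found by plain power iteration, and the "transform" is a Householder reflection whose first column is the eigenvector (up to sign). The function names `top_eigenvector`, `reflection_with_first_column` and `top_k_by_deflation` are made up for illustration.

```python
import numpy as np

def top_eigenvector(data, iters=500, seed=0):
    """Power iteration on the covariance of `data` (rows are samples)."""
    rng = np.random.default_rng(seed)
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / (len(data) - 1)
    v = rng.standard_normal(cov.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = cov @ v
        v /= np.linalg.norm(v)
    return v

def reflection_with_first_column(v):
    """Orthogonal matrix whose first column is +v or -v (Householder)."""
    u = v.copy()
    u[0] += np.copysign(1.0, v[0])  # this sign choice avoids cancellation
    return np.eye(len(v)) - 2.0 * np.outer(u, u) / (u @ u)

def top_k_by_deflation(data, k=3):
    """Find k leading eigenvectors one at a time, shrinking the data each round."""
    d = data.shape[1]
    lift = np.eye(d)                      # maps reduced coordinates back to the original ones
    current = data - data.mean(axis=0)
    vectors = []
    for _ in range(k):
        v = top_eigenvector(current)      # top eigenvector of the current data
        vectors.append(lift @ v)          # express it in the original coordinates
        q = reflection_with_first_column(v)
        current = (current @ q)[:, 1:]    # B' = rotated data, B'' = B' without its first column
        lift = lift @ q[:, 1:]
    return np.column_stack(vectors)
```

Calling `top_k_by_deflation(data, k=3)` returns a matrix whose columns are the three estimated directions. The signs of the columns are arbitrary, as with any eigenvector routine.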
About speed: you can randomly sample this huge data set down to N items. Since you only want three dimensions, I would expect that you can throw away most of the data and still get a good overview of the eigenvectors. You could call it "polling". I can't help you measure the error rate, but I would try sampling 1k items several times and see whether the results come out more or less the same.
Now you can take the average of several polls to build a "prediction".
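A rough sketch of the polling idea in Python with NumPy, again with assumptions of mine: the data is an array with one sample per row, each poll uses `np.linalg.eigh` on the sample covariance, the signs of the eigenvectors are aligned to the first poll before averaging (eigenvectors are only defined up to sign), and the averaged vectors are re-orthonormalized with a QR step. The names `top_eigenvectors` and `poll_eigenvectors` and the default of 5 polls are placeholders, not anything standard.

```python
import numpy as np

def top_eigenvectors(sample, k=3):
    """k leading eigenvectors of the sample covariance, as columns."""
    cov = np.cov(sample, rowvar=False)
    _, vecs = np.linalg.eigh(cov)          # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]

def poll_eigenvectors(data, k=3, sample_size=1000, polls=5, seed=0):
    """Estimate eigenvectors from several random subsamples ("polls")."""
    rng = np.random.default_rng(seed)
    size = min(sample_size, len(data))
    results = []
    for _ in range(polls):
        rows = rng.choice(len(data), size=size, replace=False)
        vecs = top_eigenvectors(data[rows], k)
        if results:
            # Flip each column so it points the same way as in the first poll.
            dots = np.sum(vecs * results[0], axis=0)
            vecs = vecs * np.where(dots < 0, -1.0, 1.0)
        results.append(vecs)
    stacked = np.stack(results)            # shape: polls x dimensions x k
    # |cosine| between each poll's vectors and the first poll's; values near 1 mean stable results.
    agreement = np.abs(np.einsum("pdk,dk->pk", stacked, results[0]))
    # Average the aligned vectors, then re-orthonormalize the average.
    q, r = np.linalg.qr(stacked.mean(axis=0))
    estimate = q * np.where(np.diag(r) < 0, -1.0, 1.0)
    return estimate, agreement
```

If the rows of `agreement` stay close to 1, the polls agree and the averaged `estimate` is probably a reasonable stand-in for the full computation. If they jump around, the sample size is likely too small or some eigenvalues are too close together to separate.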