Why are the eigenvalues of the adjacency matrix actually the sentence scores in TextRank?

Here is the pipeline for TextRank:

  • The document to be summarized is expressed as a tf-idf matrix
  • (tf-idf matrix) * (tf-idf matrix).Transpose = the adjacency matrix of a graph whose vertices are the sentences of the above document
  • PageRank is applied to this graph → returns the PR value of each sentence
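
The first two steps above can be sketched in a few lines. This is only a minimal illustration, assuming NumPy and a small hand-made TF-IDF matrix; the values and the names `tfidf` and `adjacency` are hypothetical and not taken from any TextRank implementation.

```python
import numpy as np

# Hypothetical TF-IDF matrix: 3 sentences (rows) x 4 terms (columns).
tfidf = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
])

# Entry (i, j) sums, over all terms, the product of the term's weight
# in sentence i and its weight in sentence j.
adjacency = tfidf @ tfidf.T

print(adjacency)
```

Sentences that share no terms (here the first and the third) get weight 0, so there is effectively no edge between them in the graph.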

Now these PR values are actually the eigenvalues of this adjacency matrix.
What is the physical meaning or intuition behind this?

Why are the eigenvalues actually the ranks?

Here is the link for Page Rank: http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm

Here is an excerpt from that page:
PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web.

Link for TextRank: https://joshbohde.com/blog/document-summarization

1 answer

To begin with, your question is slightly mistaken. The eigenvalues are not the scores. Rather, the entries of the stationary eigenvector are the scores.

TextRank uses a graph-based approach. It has several variants, but they share the following general steps:

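  • Build a weighted graph whose vertices are the text units (words or sentences), with edge weights that measure how strongly two units are related.

  • Form the stochastic matrix associated with the graph, and score each vertex according to its stationary distribution.

In this case the graph is built as follows. First, form a matrix whose rows are the sentences and whose columns are the words, with entries given by TF-IDF. To get the similarity between sentences, multiply this matrix by its transpose. For each pair of sentences and each word, the product of the word's TF-IDF weights in the two sentences contributes to their similarity, and these contributions have to be summed over all words. Summing those products is exactly what multiplication by the transpose does.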

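Normalizing the adjacency matrix gives a stochastic matrix P. With columns summing to 1, the entry P_ij can be read as the probability of moving to sentence i from sentence j. The scores are the entries of the stationary distribution x, which means that

P x = x = 1 x.

In other words, x is the eigenvector of P associated with the eigenvalue 1. The eigenvalue itself is always 1; the ranks are the entries of that eigenvector.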
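
As a rough numerical check of P x = x, here is a minimal sketch using NumPy. It assumes a column-stochastic convention (each column of P sums to 1) and a small, made-up matrix of strictly positive sentence similarities; the helper `stationary_scores` and the `weights` values are hypothetical, chosen only for illustration.

```python
import numpy as np

def stationary_scores(weights, iterations=100):
    """Power iteration on the column-stochastic matrix built from `weights`."""
    # Assumption: normalize so that every column of P sums to 1.
    P = weights / weights.sum(axis=0, keepdims=True)
    # Start from the uniform distribution and apply P repeatedly.
    x = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iterations):
        x = P @ x
    return P, x

# Hypothetical, strictly positive sentence-similarity weights.
weights = np.array([
    [0.50, 0.25, 0.10],
    [0.25, 0.50, 0.25],
    [0.10, 0.25, 0.50],
])
P, x = stationary_scores(weights)

# Compare with the eigenvector for the largest eigenvalue of P.
eigenvalues, eigenvectors = np.linalg.eig(P)
k = np.argmax(eigenvalues.real)
principal = eigenvectors[:, k].real
principal = principal / principal.sum()  # rescale so the entries sum to 1

print(eigenvalues.real[k])  # the largest eigenvalue
print(x)                    # scores from power iteration
print(principal)            # rescaled principal eigenvector
```

The largest eigenvalue carries no information about individual sentences. The per-sentence scores are the entries of the corresponding eigenvector, which is also what the iteration converges to for this matrix.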


Source: https://habr.com/ru/post/1653388/

