Lucene: How to build a term-document matrix


I need to build this matrix, but I cannot find a way to calculate the normalized tf-idf for each cell. The normalization I want is cosine normalization, which multiplies each tf-idf value (calculated using DefaultSimilarity) by 1 / sqrt(sum of the squared tf-idf values in the column).

Does anyone know a way to accomplish this?
Thanks in advance,
Antonio

1 answer

, Lucene, Sujit Pal. Lucene, , idf, tf.
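
As a rough illustration of the cosine normalization described in the question, here is a minimal plain-Java sketch. It does not call the Lucene API: the raw term counts are assumed to be already extracted from the index, and the class and method names (TermDocMatrix, build) are hypothetical. The weights follow the DefaultSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(numDocs / (docFreq + 1)). Rows are terms and columns are documents.

```java
import java.util.Arrays;

public class TermDocMatrix {

    // DefaultSimilarity-style term frequency weight: sqrt(freq).
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // DefaultSimilarity-style inverse document frequency: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    /**
     * Builds a cosine-normalized tf-idf matrix.
     * counts[t][d] is the raw frequency of term t in document d
     * (assumed to have been read from the index beforehand).
     */
    static double[][] build(int[][] counts) {
        int terms = counts.length;
        int docs = terms == 0 ? 0 : counts[0].length;
        double[][] m = new double[terms][docs];

        // Step 1: fill each cell with the unnormalized tf-idf weight.
        for (int t = 0; t < terms; t++) {
            int docFreq = 0;
            for (int d = 0; d < docs; d++) {
                if (counts[t][d] > 0) docFreq++;
            }
            double termIdf = idf(docFreq, docs);
            for (int d = 0; d < docs; d++) {
                m[t][d] = tf(counts[t][d]) * termIdf;
            }
        }

        // Step 2: cosine-normalize each column (document vector).
        for (int d = 0; d < docs; d++) {
            double sumOfSquares = 0.0;
            for (int t = 0; t < terms; t++) {
                sumOfSquares += m[t][d] * m[t][d];
            }
            if (sumOfSquares == 0.0) continue; // empty document: leave zeros
            double norm = 1.0 / Math.sqrt(sumOfSquares);
            for (int t = 0; t < terms; t++) {
                m[t][d] *= norm;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Toy counts: 3 terms (rows) x 3 documents (columns).
        int[][] counts = {
            {2, 0, 1},
            {0, 3, 1},
            {1, 1, 0}
        };
        for (double[] row : build(counts)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

After normalization every non-empty column has unit Euclidean length, so the dot product of two columns is the cosine similarity of the corresponding documents.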


Source: https://habr.com/ru/post/1786855/