How do you implement a "similar items" system for items described by a set of tags?
There are three tables in my database: Article, ArticleTag and Tag. Each Article is linked to multiple Tags through a many-to-many relationship. For each article, I want to find the five most similar articles, to implement an "if you like this article, you will like these too" feature.
I am familiar with cosine similarity, and using this algorithm gives very good results. But it is way too slow. For each article, I need to iterate over all the other articles, calculate the cosine similarity for each pair of articles, and then select the five articles with the highest similarity score.
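A minimal sketch of the brute-force approach described above, assuming tags are binary (an article either has a tag or it does not) and that the article-to-tags mapping has already been loaded from ArticleTag into a dictionary. The names `cosine_similarity`, `most_similar` and `article_tags` are hypothetical and only for illustration:

```python
import heapq
import math


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity of two tag sets treated as binary vectors."""
    if not tags_a or not tags_b:
        return 0.0
    # For binary vectors the dot product is the number of shared tags,
    # and each vector's norm is the square root of its tag count.
    return len(tags_a & tags_b) / math.sqrt(len(tags_a) * len(tags_b))


def most_similar(article_id, article_tags, k=5):
    """Scan every other article and keep the k highest-scoring ones."""
    target = article_tags[article_id]
    scored = (
        (cosine_similarity(target, tags), other_id)
        for other_id, tags in article_tags.items()
        if other_id != article_id
    )
    return [other_id for _score, other_id in heapq.nlargest(k, scored)]


# Toy data standing in for the Article/ArticleTag/Tag tables (assumed).
article_tags = {
    1: {"python", "sql", "search"},
    2: {"python", "sql"},
    3: {"cooking"},
    4: {"search", "sql", "python", "nlp"},
}
print(most_similar(1, article_tags, k=2))  # [4, 2]
```

Every query touches every article, so the cost of one lookup grows linearly with the number of articles, which is the bottleneck I am trying to get rid of.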
With 200 thousand articles and 30 thousand tags, it takes half a minute to calculate the similar articles for a single article. So I need another algorithm that gives roughly the same quality of results as cosine similarity, but that can run in real time and does not require me to iterate over the entire document corpus every time.
Can someone suggest a ready-made solution for this? Most of the search engines I have looked at do not allow searching by document, that is, using a whole document as the query.