How to implement "related articles?"

How to write code that finds related (similar) articles with what the user is reading now?

For example, suppose I have articles:

Python programming tips
Python programming for newbies
Programming in Python, ActionScript and Flash
Programming in the Jungle
Tarzan saves newbie Judy from using Fortran programming language

(I came up with these headings right now.)

How can I query the database and find that they are all related?

I would be grateful for any suggestions.

Thanks, Boda Sido.

+3
source share
5 answers

This book contains some tips for this; more specifically, it sounds like a Collaborative Filtering issue .

. - , , , ex.

, Google. , , , , , .

+2

? " " , MySQL . Google .

+1

tf -idf

- - , ( ), , tf-idf.
tf-idf , (term frequency - tf), ( - idf).

+1
source

If your case is indeed a content-oriented website, then it is probably best to contact the editors to add tags to each article. This is how it is done all over the Internet (e.g. Wordpress).

In addition, there may be ways to do this using language processing, but since you are using Python, I will leave this to the people who are python experts ...

0
source

One suggestion is to add tags to all your articles. Related articles are those that have similar tags.

0
source

Source: https://habr.com/ru/post/1735039/


All Articles