Trending topics: one-word terms versus multi-word terms

Thanks to the great help I got here, I have already worked out how to calculate trending topics (standard score + moving average).

My next problem: my database holds terms (1-3 words long) together with the times they were mentioned. But the trending topics always turn out to be single-word terms, because one part of a term is ALWAYS mentioned more often than the full term. Example: yesterday 3 news articles were about “Barack Obama”, and today 148 are. So “Barack Obama” is rising, of course. But “Barack” alone is rising too, and therefore it shows up as a trend as well.

How can I take term length into account when calculating trending topics? I do not want to switch to another algorithm; I am completely satisfied with the one above. Could I multiply the score of every two-word term by 1.5 or so?
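To make the multiplier idea concrete, here is a minimal sketch in Python. It assumes the trend scores have already been computed by the existing algorithm and are available as a dict; the function name `length_weighted`, the `boost` parameter, and the sample scores are hypothetical and only serve as illustration.

```python
def length_weighted(scores, boost=0.5):
    """Scale each term's trend score by its word count.

    With boost=0.5, a two-word term is multiplied by 1.5 and a
    three-word term by 2.0; one-word terms are left unchanged.
    """
    weighted = {}
    for term, score in scores.items():
        extra_words = len(term.split()) - 1
        weighted[term] = score * (1 + boost * extra_words)
    return weighted


# Hypothetical scores as produced by the existing trend algorithm.
scores = {
    "Microsoft": 4.0,
    "China": 3.5,
    "Hillary Clinton": 3.0,
    "Dallas Mavericks": 2.5,
}

weighted = length_weighted(scores)
ranked = sorted(weighted, key=weighted.get, reverse=True)
print(ranked)
```

Whether 1.5 is a sensible factor depends on the data; it would probably have to be tuned against real rankings.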

Detailed example: my top trends are Microsoft, China, Hillary Clinton, Dallas Mavericks. What I mean is that “Hillary Clinton” and “Dallas Mavericks” never reach rank 1 or rank 2, because they are two-word terms. “Microsoft” and “China” are single-word terms, so they are always ranked higher. Is there any way to solve this problem?

I hope you can help me. Thanks in advance!

2 answers

Speaking of Obama, yes you can. :)

, , . :

:

  • Air France
  • A330
  • ...

, (, 100 ), , , , 50% , . (, 150 , , - 110, 10 , 100 .)

" " "", "", 100%, :

  • Air France
  • A330
  • ...

, , , .

:

, , , , (, " " - " " + 0,5 * "" + 0,5 * "" ).


@subtenante, , , , " " "", ""...
, :

"Barack"s + "Obama"s - "Barack Obama"s

... , , , , "" " " ( " " ).


Source: https://habr.com/ru/post/1709600/

