Word Filtering

For the project I'm working on, I display tweets that I receive from the Twitter Streaming API. Before displaying a tweet, I need to check each of its words against a list of blacklisted words.

Currently, I store all the blacklisted words in a MongoDB collection.

The obvious approach that comes to mind is to explode the tweet into individual words and then, for each word, check whether the blacklist collection contains it.
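
To make the cost concrete, here is a rough sketch of that per-word approach. The database lookup is replaced by a hypothetical closure, $isBlacklisted, that only counts how often it is called; the sample words and the tweetIsClean() helper are made up for illustration and are not part of my real code.

```php
<?php
// Hypothetical stand-in for one query against the blacklist collection.
// It counts its calls so the number of lookups per tweet is visible.
$queryCount = 0;
$storedBlacklist = ['spam', 'scam']; // assumed sample data

$isBlacklisted = function (string $word) use (&$queryCount, $storedBlacklist): bool {
    $queryCount++;
    return in_array(strtolower($word), $storedBlacklist, true);
};

function tweetIsClean(string $tweet, callable $isBlacklisted): bool
{
    // Split on every run of characters that are not letters or digits.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $tweet, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    foreach ($words as $word) {
        if ($isBlacklisted($word)) {
            return false; // one lookup per word until the first hit
        }
    }

    return true;
}

$clean = tweetIsClean('Just a normal tweet about lunch', $isBlacklisted);
// $queryCount now equals the number of words in the tweet.
```

A clean tweet is the worst case here, because every one of its words triggers a lookup.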

However, this would mean roughly 20 database queries for every tweet I display.

Is there a better way to do this?

1 answer

I would fetch all the blacklisted words from the database once, store them in a variable as a single string (separated by |), and use preg_match() to check whether the tweet contains any of them.

    $blacklist = 'blacklisted|words';

    if (preg_match('/\b(' . $blacklist . ')\b/i', $tweet)) {
        // Don't show the tweet
    } else {
        // Show the tweet
    }
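
If the words come out of the database as an array, the pattern could be assembled along these lines. This is only a sketch: buildBlacklistPattern() and shouldShowTweet() are hypothetical helper names, the sample words are invented, and the array is assumed to have been loaded once rather than per tweet. Each word goes through preg_quote() so that characters such as "." are matched literally instead of being read as regex syntax.

```php
<?php
// Build one case-insensitive pattern from a list of blacklisted words.
// Returns null when the list is empty, since '/\b(?:)\b/' would match
// at every word boundary.
function buildBlacklistPattern(array $words): ?string
{
    $escaped = [];
    foreach ($words as $word) {
        $word = trim($word);
        if ($word !== '') {
            // Escape regex metacharacters and the "/" delimiter.
            $escaped[] = preg_quote($word, '/');
        }
    }

    if ($escaped === []) {
        return null;
    }

    // "u" treats pattern and subject as UTF-8, "i" ignores case.
    return '/\b(?:' . implode('|', $escaped) . ')\b/iu';
}

function shouldShowTweet(string $tweet, ?string $pattern): bool
{
    if ($pattern === null) {
        return true; // nothing is blacklisted
    }

    // preg_match() returns 1 on a match, 0 on no match and false on error
    // (for example invalid UTF-8). Only a definite "no match" shows the tweet.
    return preg_match($pattern, $tweet) === 0;
}

// Assumed sample data; in practice this would be read from the collection once.
$blacklist = ['spam', 'free money', 'win.big'];
$pattern = buildBlacklistPattern($blacklist);

$show = shouldShowTweet('Get FREE MONEY now', $pattern);
```

Note that \b only behaves as expected for entries that start and end with a letter, digit or underscore, and that a very large blacklist may be better served by a different structure than a single alternation.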

Source: https://habr.com/ru/post/919552/

