How do I run a query to find a string in BLOB fields?

The MediaWiki database has a table named text that contains the contents of the pages. The content is stored in a BLOB field. I would like to run a query that searches all the text on the site to see which pages contain a specific string. How do I run a query that searches BLOB fields?

+6
2 answers

MediaWiki markup text is stored in the old_text field, which is of type mediumblob. You can query it like any other text field; MySQL will convert your search string to binary for the comparison. Note that this makes the search case-sensitive!

 SELECT old_id FROM text WHERE old_text LIKE '%string%';

If you need a case-insensitive search, convert the field to a character set whose collation is case-insensitive (the default collation of latin1 is):

 SELECT old_id FROM text WHERE CONVERT(old_text USING latin1) LIKE '%STRing%';

Keep in mind that if your table is large, these queries will take a long time: a LIKE pattern with a leading wildcard cannot use an index, so every row has to be scanned.

+6

According to the MediaWiki documentation, the text table stores only the text of individual revisions. To get at the full text of a page, you would therefore have to process all the revisions that belong to that page. It is better to call the MediaWiki search API and process the results than to search with an SQL query.
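A minimal sketch of what such an API call could look like, using only the Python standard library and the list=search module of the MediaWiki Action API. The endpoint URL and the helper names (build_search_url, extract_titles, search_pages) are made up for illustration; point API_URL at your own wiki's api.php. Note that the search backend matches words in current page revisions, so it is not guaranteed to behave like an exact substring match in SQL.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the api.php URL of your own wiki.
API_URL = "https://wiki.example.org/w/api.php"


def build_search_url(needle, limit=50, offset=0, api_url=API_URL):
    """Build a list=search request URL for the MediaWiki Action API."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": needle,   # the text to look for
        "srlimit": limit,     # results per request
        "sroffset": offset,   # where to resume when paging
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def extract_titles(response):
    """Pull page titles out of a decoded list=search JSON response."""
    hits = response.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits]


def search_pages(needle, api_url=API_URL):
    """Yield titles of matching pages, following result continuation."""
    offset = 0
    while True:
        url = build_search_url(needle, offset=offset, api_url=api_url)
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        yield from extract_titles(data)
        cont = data.get("continue")
        if not cont or "sroffset" not in cont:
            break  # no more result pages
        offset = cont["sroffset"]
```

Against a real wiki you would iterate over search_pages("string") to print the matching titles; this needs network access and a wiki that exposes api.php.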

0

Source: https://habr.com/ru/post/956629/
