The most efficient way to search a database with over a billion records?

My client has a huge database containing only three fields:

  • Primary Key (unsigned number)
  • Title (verbose text)
  • Description (varchar, up to 1000 characters)

There are several billion records in this database. I have no experience with such large amounts of data.

He wants me to develop an AJAX interface (like Google's) to search this database. My queries are as slow as a turtle.

What is the best way to search text fields in such a large database? If a user misspells something in the interface, how can I return what he was looking for?

+6
3 answers

If you are using FULLTEXT indexes and writing your queries correctly, and the results still do not come back fast enough, you are entering territory where MySQL may simply not be enough for you.

Perhaps you can tune the settings and buy enough RAM to make sure the entire data set fits 100% in memory. The performance gain from that can certainly be huge.

I definitely recommend reviewing your MySQL settings. We had some dumb settings in the past. The defaults shipped by operating systems tend to really suck!
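Whether the data set can fit in memory is easy to estimate with some arithmetic. The sketch below is only illustrative: the row count and average field sizes are made-up assumptions (the question says only "several billion records" and "up to 1000" characters), and it ignores index and storage engine overhead.

```python
def estimate_data_bytes(rows, key_bytes, avg_title_bytes, avg_description_bytes):
    """Rough raw data size: rows times average bytes per row (no index or engine overhead)."""
    return rows * (key_bytes + avg_title_bytes + avg_description_bytes)


def to_gib(num_bytes):
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


# Hypothetical inputs: 3 billion rows, 4-byte key, ~50-byte titles, ~500-byte descriptions.
ASSUMED_ROWS = 3_000_000_000
raw = estimate_data_bytes(ASSUMED_ROWS, 4, 50, 500)
print(f"approx. raw data size: {to_gib(raw):,.0f} GiB")
```

Under these assumed numbers the raw data alone is well over a terabyte, so keeping everything in RAM means a very large machine or several of them.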

However, if you still have problems at this point, you can:

  • Create a separate table containing each word (indexed) along with the identifier of the record it refers to. This will let you search for individual words.
  • Use a different system that is optimized for this problem. If my information is not out of date, the 2 most popular engines for this are:
    • Sphinx
    • Solr / Lucene
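
The first option above, a separate word table, can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module with an in-memory database rather than MySQL; the table names (articles, word_index), the column names, and the naive tokenizer are assumptions made for the example, not part of the original schema.

```python
import re
import sqlite3


def tokenize(text):
    """Split text into lowercase alphanumeric words (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(conn, rows):
    """Create the hypothetical articles/word_index tables and fill them from (id, title, description) rows."""
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, description TEXT)")
    # One row per (word, record) pair; the composite primary key doubles as the lookup index.
    conn.execute(
        "CREATE TABLE word_index (word TEXT NOT NULL, article_id INTEGER NOT NULL, "
        "PRIMARY KEY (word, article_id))"
    )
    for article_id, title, description in rows:
        conn.execute("INSERT INTO articles VALUES (?, ?, ?)", (article_id, title, description))
        words = set(tokenize(title) + tokenize(description))
        conn.executemany(
            "INSERT INTO word_index VALUES (?, ?)",
            [(w, article_id) for w in words],
        )
    conn.commit()


def search(conn, query):
    """Return ids of articles containing every word in the query, in ascending order."""
    words = sorted(set(tokenize(query)))
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    sql = (
        "SELECT article_id FROM word_index "
        f"WHERE word IN ({placeholders}) "
        "GROUP BY article_id HAVING COUNT(*) = ? "
        "ORDER BY article_id"
    )
    return [row[0] for row in conn.execute(sql, (*words, len(words)))]


conn = sqlite3.connect(":memory:")
build_index(conn, [
    (1, "MySQL tuning", "Settings that affect database performance"),
    (2, "Search engines", "Sphinx and Solr index text outside the database"),
    (3, "Backups", "Copy your database before changing indexes"),
])
print(search(conn, "database index"))
```

Each search word becomes an indexed equality lookup instead of a scan over the text columns. Note that this sketch matches exact words only: "index" does not match "indexes", and misspellings are not handled.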
+7

You can't. The only fast lookup in your scenario is by primary key, since that is most likely indexed. Text search is as slow as a turtle.

In all seriousness, you have several solutions:

If you need to stick with NoSQL, you have to redesign the schema. It is hard to give you a good recommendation without knowing the requirements. One solution would be to index keywords in a separate table.

Another solution is to switch to a different search engine; you can find suggestions in other questions here, for example: Quick SQL Server search in 40M text records

0

If your table is MyISAM, you can put a FULLTEXT index on the Name and Description fields:

CREATE TABLE articles (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  Name VARCHAR(200),
  Description TEXT,
  FULLTEXT (Name, Description)
) ENGINE=MyISAM;

Then you can use queries such as:

 SELECT * FROM articles WHERE MATCH (Name,Description) AGAINST ('database'); 

You can find more information at http://docs.oracle.com/cd/E17952_01/refman-5.0-en/fulltext-search.html

Before doing any of the above, you may want to back up (or at least make a copy of) your database.

0

Source: https://habr.com/ru/post/948026/

