Good idea / bad idea? Using MySQL RAND() outside of a small set of subquery results?

So, in MySQL, I've read that using ORDER BY RAND() on large tables with lots of rows is a bad idea (presumably even on tables of ~500 rows). It's slow and inefficient, with lots of row scans.

How does the following look as an alternative?

SELECT * FROM (... a subquery that usually returns a set of less than 20 rows ...) AS t ORDER BY RAND() LIMIT 8

Instead of using RAND() on a large dataset, I would first select a small subset and only then apply RAND() to the returned rows. In 99.9% of cases, the subquery above should select fewer than 20 rows (and in fact usually fewer than 8).

Curious to hear what people think.

(Just for reference, I'm doing my MySQL work from PHP.)

Thanks!

+4
3 answers

Actually ... I ran a test, and I may have answered my own question. I thought I'd post the results here in case they're useful to someone else. (If I did something wrong, please let me know!)

It's surprising, and contrary to everything I'd read...

I created a TestData table with 1 million rows and ran the following query:

SELECT * FROM TestData WHERE number = 41 ORDER BY RAND() LIMIT 8

... and it returned the rows in 0.0070 seconds on average. I really don't understand why RAND() has such a bad reputation. It seems perfectly usable to me, at least in this particular situation.

I have three columns in my table:

id [BIGINT(20)] | text field [TINYTEXT] | number [BIGINT(20)]

Primary key on id, index on number.
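For anyone who wants to reproduce the test, the table described above might look roughly like the sketch below. The column name text_field and the index name idx_number are assumed here for illustration; the post only gives the column types and says which columns are keyed.

```sql
-- Rough reconstruction of the test table described in the post.
-- `text_field` and `idx_number` are assumed names, not taken from the original.
CREATE TABLE TestData (
    id         BIGINT(20) NOT NULL,
    text_field TINYTEXT,
    number     BIGINT(20),
    PRIMARY KEY (id),          -- primary key on id
    KEY idx_number (number)    -- secondary index on number
);
```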

I guess MySQL is smart enough to know that it only needs to apply RAND() to the 20 rows returned by "WHERE number = 41"? (I deliberately added only 20 rows with a value of 41 for "number".)
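One way to check that guess, rather than inferring it from timings alone, is to ask MySQL for the query plan. This is only a sketch: the exact output depends on the MySQL version and on the table's statistics.

```sql
-- Show how MySQL plans to execute the query from the test above.
-- If the index on `number` is used, the `key` column of the output should
-- name that index and the `rows` estimate should be small (near 20 here),
-- meaning the random sort only touches the filtered rows.
EXPLAIN
SELECT * FROM TestData
WHERE number = 41
ORDER BY RAND()
LIMIT 8;
```

If the plan instead shows a full table scan (no key chosen and a rows estimate near the table size), the WHERE clause is not narrowing the set before the sort, and the usual warnings about ORDER BY RAND() would apply again.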

The alternative subquery method returns results in about 0.0080 seconds on average, which is slightly slower than the method without a subquery.

Subquery method: SELECT * FROM (SELECT * FROM TestData WHERE number = 41) AS t ORDER BY RAND() LIMIT 8

+8

You seem to be on the right track. One of the best ways to use MySQL more efficiently is to limit your datasets with the primary query before doing anything expensive with them.

0

I recently posted this article about the problem: http://www.electrictoolbox.com/mysql-random-order-random-value/, but I don't really like adding another column to my data.

0

Source: https://habr.com/ru/post/1335792/

