An alternative to ORDER BY RAND() for MySQL

Posted in MySql and PHP -

I've posted previously about how to randomly order a resultset with MySQL using RAND() but the issue with RAND() is it will be inefficient on large tables because each row needs to have the random number calculated before the resultset can be ordered. This post looks at an alternative which requires two queries but will be much more efficient for large tables.

Please note

I have written a later post titled "MySQL's LIMIT syntax can be slow for large offsets". While using LIMIT syntax as shown in this post is a lot faster than ORDER BY RAND() it's still slow for larger tables. I'm currently working on a better alternative.

The alternative, and a note about INNODB vs MyISAM tables

The alternative suggested in this post uses COUNT(*) first to get the number of records in the table and then picks the record by using MySQL's LIMIT syntax. Note that INNODB does not cache the count of a table like MyISAM does so it takes slightly longer to return the count.

Example table

I often use an example table containing fruit. Here's the output from SELECT * FROM fruit:

+----------+--------+-----------+
| fruit_id | name   | somevalue |
+----------+--------+-----------+
|        1 | Banana |         2 |
|        2 | Orange |         4 |
|        3 | Cherry |         3 |
|        4 | Apple  |         1 |
+----------+--------+-----------+

MySQL only solution

It's possible to do this entirely with MySQL SQL queries without having to run code in PHP or other programming language. The following SQL queries first gets a count from the table, then selects a random offset based on that count. It then prepares a statement so the calculated offset can be used and executes the statement. Note that the offset is cast as a signed integerl without this you'll get the error message "ERROR 1210 (HY000): Incorrect arguments to EXECUTE".

SELECT @count := COUNT(*) FROM fruit;
SET @offset = CONVERT(FLOOR(RAND() * @count), SIGNED);
PREPARE mystatement FROM "SELECT * FROM fruit LIMIT ?, 1";
EXECUTE mystatement USING @offset;
DEALLOCATE PREPARE mystatement;

The output from the above will result in a random record returned each time e.g.:

+----------+--------+-----------+
| fruit_id | name   | somevalue |
+----------+--------+-----------+
|        3 | Cherry |         3 |
+----------+--------+-----------+

This works from the MySQL command line but doesn't appear to work in tools like phpMyAdmin (which does successfully execute the SQL but doesn't output any data) or MySQL Query Browser. It does work programatically from PHP (and therefore will for other programming languages). For exmample this:

mysql_query('SELECT @count := COUNT(*) FROM fruit');
mysql_query('SET @offset = CONVERT(FLOOR(RAND() * @count), SIGNED)');
mysql_query('PREPARE mystatement FROM "SELECT * FROM fruit LIMIT ?, 1"');
$res = mysql_query('EXECUTE mystatement USING @offset');
$row = mysql_fetch_assoc($res);
print_r($row);

outputs this:

Array
(
    [fruit_id] => 1
    [name] => Banana
    [somevalue] => 2
)

Using a programming language

Doing this outside MySQL is done in a similar way by getting the count from the table first, working out a random offset and then running a second query to get the record. This example uses PHP:

$res = mysql_query("SELECT COUNT(*) FROM fruit");
$row = mysql_fetch_array($res);
$offset = rand(0, $row[0]-1);

$res = mysql_query("SELECT * FROM fruit LIMIT $offset, 1");
$row = mysql_fetch_assoc($res);

Conclusion

My example only uses a small table and in this instance would be easier to simply use "ORDER BY RAND()" but ordering in this way is inefficient and performance will start to suffer as the table grows. Instead it can be more efficient to run a couple of queries instead to work out an offset based on the number of records in the table, and then use MySQL's LIMIT syntax to pull that record out of the table as shown in this post.



Related posts:


Comments