Aman King

GROUP BY in MyS...Control PanelChange LogBrowse PagesSearch?

GROUP BY in MySQL does implicit ordering

It may not be obvious, and in most cases, it may not matter but sometimes it is helpful to know that GROUP BY in MySQL does implicit ordering according to the grouped columns.

Let's take an example. Suppose we have a table like so:

mysql> SELECT * FROM users;
+----+--------+------+
| id | name   | age  |
+----+--------+------+
|  1 | donald |   27 | 
|  2 | mickey |   20 | 
|  3 | minnie |   23 | 
|  8 | goofy  |   30 | 
|  5 | daisy  |   20 | 
|  6 | jane   |   20 | 
|  7 | bruce  |   30 | 
|  9 | clark  |   35 | 
| 10 | kit    |   25 | 
| 11 | peter  |   30 | 
+----+--------+------+

Now let's say we want to list the 5 most recent users (for simplicity, let's say the id represents when the entry was made... please note that in real projects, one should avoid depending on database ids for business logic):

mysql> SELECT * FROM users ORDER BY id DESC LIMIT 5;
+----+-------+------+
| id | name  | age  |
+----+-------+------+
| 11 | peter |   30 | 
| 10 | kit   |   25 | 
|  9 | clark |   35 | 
|  8 | goofy |   30 | 
|  7 | bruce |   30 | 
+----+-------+------+

Now, let's say we want only a single representative for a given age, and if there are multiple, the most recently joined user wins... this is where we bring in GROUP BY to perform a distinction:

mysql> SELECT u.* FROM users u INNER JOIN 
       (SELECT max(id) max_id FROM (SELECT * FROM users ORDER BY id DESC LIMIT 5) recent_users GROUP BY age) t 
       ON u.id = t.max_id;
+----+-------+------+
| id | name  | age  |
+----+-------+------+
| 10 | kit   |   25 | 
| 11 | peter |   30 | 
|  9 | clark |   35 | 
+----+-------+------+

We can see that the ordering is now implicitly done on age which is the column mentioned in GROUP BY. Supposing we were still interested in the ordering to be recent user first:

mysql> SELECT u.* FROM users u INNER JOIN 
       (SELECT max(id) max_id FROM (SELECT * FROM users ORDER BY id DESC LIMIT 5) recent_users GROUP BY age) t 
       ON u.id = t.max_id ORDER BY id DESC;
+----+-------+------+
| id | name  | age  |
+----+-------+------+
| 11 | peter |   30 | 
| 10 | kit   |   25 | 
|  9 | clark |   35 | 
+----+-------+------+

This gives us the correct result but the extra ordering step is unnecessary (as the nested query already orders the results), and can be avoided by ORDER BY NULL (this simply avoids the implicit ordering that GROUP BY does):

mysql> SELECT u.* FROM users u INNER JOIN 
       (SELECT max(id) max_id FROM (SELECT * FROM users ORDER BY id DESC LIMIT 5) recent_users GROUP BY age ORDER BY NULL) t 
       ON u.id = t.max_id;
+----+-------+------+
| id | name  | age  |
+----+-------+------+
| 11 | peter |   30 | 
| 10 | kit   |   25 | 
|  9 | clark |   35 | 
+----+-------+------+

Comments

Talk
Tags: technology:database Last modified 09:36 Mon, 7 Sept 2009 by AmanKing. Accessed 566 times Children What Links Here share Share Except where expressly noted, this work is licensed under a Creative Commons License.