京東一面:MySQL 中的 distinct 和 group by 哪個效率更高?
在語義相同,有索引的情況下:group by和distinct都能使用索引,效率相同。 在語義相同,無索引的情況下:distinct效率高于group by。原因是distinct 和 group by都會進(jìn)行分組操作,但group by可能會進(jìn)行排序,觸發(fā)filesort,導(dǎo)致sql執(zhí)行效率低下。
為什么在語義相同,有索引的情況下,group by和distinct效率相同? 在什么情況下,group by會進(jìn)行排序操作?
1
SELECT DISTINCT columns FROM table_name WHERE where_conditions;mysql> select distinct age from student;
+------+
| age |
+------+
| 10 |
| 12 |
| 11 |
| NULL |
+------+
4 rows in set (0.01 sec)SELECT DISTINCT column1,column2 FROM table_name WHERE where_conditions;mysql> select distinct sex,age from student;
+--------+------+
| sex | age |
+--------+------+
| male | 10 |
| female | 12 |
| male | 11 |
| male | NULL |
| female | 11 |
+--------+------+
5 rows in set (0.02 sec)SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;mysql> select age from student group by age;
+------+
| age |
+------+
| 10 |
| 12 |
| 11 |
| NULL |
+------+
4 rows in set (0.02 sec)SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;mysql> select sex,age from student group by sex,age;
+--------+------+
| sex | age |
+--------+------+
| male | 10 |
| female | 12 |
| male | 11 |
| male | NULL |
| female | 11 |
+--------+------+
5 rows in set (0.03 sec)mysql> select sex,age from student group by sex;
+--------+-----+
| sex | age |
+--------+-----+
| male | 10 |
| female | 12 |
+--------+-----+
2 rows in set (0.03 sec)mysql> explain select int1_index from test_distinct_groupby group by int1_index;
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | test_distinct_groupby | NULL | range | index_1 | index_1 | 5 | NULL | 955 | 100.00 | Using index for group-by |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set (0.05 sec)
mysql> explain select distinct int1_index from test_distinct_groupby;
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | test_distinct_groupby | NULL | range | index_1 | index_1 | 5 | NULL | 955 | 100.00 | Using index for group-by |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set (0.05 sec)mysql> explain select int6_bigger_random from test_distinct_groupby GROUP BY int6_bigger_random;
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
| 1 | SIMPLE | test_distinct_groupby | NULL | ALL | NULL | NULL | NULL | NULL | 97402 | 100.00 | Using temporary; Using filesort |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
1 row in set (0.04 sec)https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
GROUP BY 默認(rèn)隱式排序(指在 GROUP BY 列沒有 ASC 或 DESC 指示符的情況下也會進(jìn)行排序)。然而,GROUP BY進(jìn)行顯式或隱式排序已經(jīng)過時(deprecated)了,要生成給定的排序順序,請?zhí)峁?ORDER BY 子句。
https://dev.mysql.com/doc/refman/8.0/en/order-by-optimization.html
從前(Mysql5.7版本之前),Group by會根據(jù)確定的條件進(jìn)行隱式排序。在mysql 8.0中,已經(jīng)移除了這個功能,所以不再需要通過添加order by null 來禁止隱式排序了,但是,查詢結(jié)果可能與以前的 MySQL 版本不同。要生成給定順序的結(jié)果,請按通過ORDER BY指定需要進(jìn)行排序的字段。
在語義相同,有索引的情況下:
在語義相同,無索引的情況下:
group by語義更為清晰 group by可對數(shù)據(jù)進(jìn)行更為復(fù)雜的一些處理
??
評論
圖片
表情
