一个left join SQL 简单优化分析
有个关联查询的sql,需要2秒多,于是进行查看一番:
SELECTa.id,a.brand_id,a.series_id,a.product_id,a.material_id,a.custom_category_id,a.price,a.product_url,a.organ_id,.....FROMpm_brand_xxxxaLEFTJOINpm_brand_yyyyydONa.series_id=d.idWHEREa.is_delete=0ANDd.is_delete=0ANDa.organ_id='Cxxx'ANDa.brand_id=6491603ANDd.brand_id=6491603ANDa.model_flag=14;
mysql>showprofileforquery4;+----------------------+----------+|Status|Duration|+----------------------+----------+|starting|0.000072||checkingpermissions|0.000002||checkingpermissions|0.000002||Openingtables|0.000011||init|0.000026||Systemlock|0.000007||optimizing|0.000016||statistics|0.000142||preparing|0.000018||executing|0.000002||Sendingdata|2.281192|<<<<<<<执行的主要时间消耗|end|0.000007||queryend|0.000011||closingtables|0.000011||freeingitems|0.000030||loggingslowquery|0.000003||loggingslowquery|0.000102||cleaningup|0.000022|+----------------------+----------+
+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------+-------+----------+------------------------------------------------------------------------------------------------------------------------------------------------+|id|select_type|table|partitions|type|possible_keys|key|key_len|ref|rows|filtered|Extra|+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------+-------+----------+------------------------------------------------------------------------------------------------------------------------------------------------+|1|SIMPLE|d|NULL|ref|PRIMARY,idx_pm_yyyy_bid|idx_pm_yyyyy_bid|9|const|1|10.00|Usingwhere||1|SIMPLE|a|NULL|index_merge|idx_pm_xxxx_sid,idx_pm_xxx_bid,idx_pm_brand_xxxx_organ|idx_pm_xxx_organ,idx_pm_brand_xxxx_bid|99,9|NULL|11314|0.04|Usingintersect(idx_pm_xxxxx_organ,idx_pm_xxxx_bid);Usingwhere;Usingjoinbuffer(BlockNestedLoop)|+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------+-------+----------+------------------------------------------------------------------------------------------------------------------------------------------------+2rowsinset,1warning(0.00sec)
从执行计划来看,d表是做了驱动表,a做了被驱动表
d表 type = ref ,使用非唯一性索引或者唯一索引的前缀扫描,返回匹配某个单独值的记录行,这里使用了索引idx_pm_yyyyy_bid,该索引正是brand_id上的索引,
即是说,在和a表的关联中d先通过brand_id来查找记录行,再通过相应记录的id去和a表的series_id做匹配。
我查看相应的记录数,发现a表145万的大表,d表是4075的小表。
a表
mysql> select count(*) from pm_xxxxxx;
+----------+
| count(*) |
+----------+
| 1459777 |
+----------+
1 row in set (0.27 sec)
d表:
mysql> select count(*) from pm_yyyyyy;
+----------+
| count(*) |
+----------+
| 4075 |
+----------+
1 row in set (0.00 sec)
而 a表是type=index_merge 索引合并,这里走了idx_pm_xxx_organ(organ_id),idx_pm_brand_xxxx_bid(brand_id) ,extra 是
Using intersect(idx_pm_xxxxx_organ,idx_pm_xxxx_bid); Using where; Using join buffer (Block Nested Loop)
Using intersect正说明了这里使用了(idx_pm_xxxxx_organ,idx_pm_xxxx_bid)的交集
Using where 是用model_flag等这些其他条件的过滤
Using join buffer (Block Nested Loop) 说明使用BNL的算法进行匹配
BNL 算法是将外层循环的行/结果集(驱动表)存入join buffer, 内层循环的每一行与整个buffer中的记录做比较,从而减少内层循环的次数.
举例来说,外层循环的结果集是100行,使用NLJ 算法需要扫描内部表100次,如果使用BNL算法,先把对Outer Loop表(外部表)每次读取的10行记录放到join buffer,然后在InnerLoop表(内部表)中直接匹配这10行数据,内存循环就可以一次与这10行进行比较, 这样只需要比较10次,对内部表的扫描减少了9/10。所以BNL算法就能够显著减少内层循环表扫描的次数.
在这里就是d表中取得结果集分批放入buffer中与a表进行匹配。
而这个语句无论如何都要2秒中,也在我们的认识中小表驱动大表并没错,我的猜想应该就是在进行BNL时消耗了时间,表现到过程中就是 Sending data 的时间消耗增多。
吐槽的是mysql中貌似没有什么办法来多方面看查询消耗了。
我想到的是如果该表现有sql关联的顺序是否性能能改善,在该sql中,我发现了两个条件:
AND a.brand_id = 6491603
AND d.brand_id = 6491603
在业务逻辑上这两个表的字段应该是一致的,如果我将d表的d.brand_id = 6491603去掉,以上的执行计划应该会改变,于是去掉之后执行,执行时间非常小。
mysql>showprofileforquery1;+----------------------+----------+|Status|Duration|+----------------------+----------+|starting|0.000080||checkingpermissions|0.000002||checkingpermissions|0.000002||Openingtables|0.000012||init|0.000030||Systemlock|0.000006||optimizing|0.000014||statistics|0.000130||preparing|0.000016||executing|0.000001||Sendingdata|0.027325||end|0.000003||queryend|0.000015||closingtables|0.000005||freeingitems|0.000014||cleaningup|0.000009|+----------------------+----------+16rowsinset,1warning(0.00sec)
看其执行计划:+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------------------------+-------+----------+---------------------------------------------------------------------------------------------------------+|id|select_type|table|partitions|type|possible_keys|key|key_len|ref|rows|filtered|Extra|+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------------------------+-------+----------+---------------------------------------------------------------------------------------------------------+|1|SIMPLE|a|NULL|index_merge|idx_pm_xxxxx_sid,idx_pm_xxxxx_bid,idx_pm_xxxx_organ|idx_pm_xxxxx_organ,idx_pm_xxxx_bid|99,9|NULL|11315|1.00|Usingintersect(idx_pm_xxxxx_organ,idx_pm_xxxx_bid);Usingwhere||1|SIMPLE|d|NULL|eq_ref|PRIMARY|PRIMARY|8|xxxx.a.series_id|1|10.00|Usingwhere|+----+-------------+-------+------------+-------------+---------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------+---------+-------------------------+-------+----------+---------------------------------------------------------------------------------------------------------+2rowsinset,1warning(0.00sec)
发现变成了a表做驱动表,d表做被驱动表,从extra列看
a表是Using intersect(idx_pm_xxxxx_organ,idx_pm_xxxx_bid); Using where 依然是使用索引合并,where条件来取结果,使用了idx_pm_xxxxx_organ,idx_pm_xxxx_bid 连个索引。
d表走PRIMARY 主键索引,从ref列来看是通过a表的series_id 来关联,这样效率表提升了。
需要说的一点是,小结果集并不代表就是小表,大表也可以有小结果集,当大表用来被匹配并被扫描多次,自然效率并不高.
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。