MYSQL 中 exists 语句执行效率变低
在ORACLE 中,我们常常推荐使用exists 来替代in,往往也能取得比较好的优化效果。在ORACLE应用迁往MYSQL的过程中,我们发现部分in 的子查询语句带到MYSQL中,其执行效率变得非常低下,这很让人觉得匪夷所思。于是,我分析了一波。
对两个表,分别是一大一小进行关联查询:
mysql>selectcount(*)fromusers;+----------+|count(*)|+----------+|19|+----------+1rowinsetmysql>selectcount(*)fromorders;+----------+|count(*)|+----------+|86310|+----------+1rowinsetmysql>
开启profile,发现无论是子查询是大表还是小表 ,exists的语句总是比in执行慢:
mysql>showprofiles;+----------+------------+--------------------------------------------------------------------------------------------+|Query_ID|Duration|Query|+----------+------------+--------------------------------------------------------------------------------------------+|1|1.08661625|selectcount(1)fromordersowhereo.user_idin(selectu.idfromusersu)||2|1.56956275|selectcount(1)fromordersowhereexists(select1fromusersuwhereu.id=o.user_id)||3|0.81266425|selectcount(1)fromusersuwhereu.idin(selecto.user_idfromorderso)||4|8.4164905|selectcount(1)fromusersuwhereexists(select1fromordersowhereu.id=o.user_id)|+----------+------------+--------------------------------------------------------------------------------------------+4rowsinset
而查看exists语句的profile内容,发现其存在多个executing 和sending data过程,这是整个sql执行的主要耗时过程:
mysql>showprofileforquery2;+----------------------------+----------+|Status|Duration|+----------------------------+----------+|executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.5E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|1E-6||Sendingdata|1.3E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.7E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.1E-5||executing|2E-6||Sendingdata|1.2E-5||executing|1E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.1E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.2E-5||executing|2E-6||Sendingdata|1.5E-5||end|3E-6||queryend|3E-6||waitingforhandlercommit|1E-5||closingtables|9E-6||freeingitems|0.000152||cleaningup|1.7E-5|+----------------------------+----------+100rowsinset
而在in 子查询中,sending data这个过程只有一次,这也是整个sql执行主要耗时地方:
mysql>showprofileforquery1;+--------------------------------+----------+|Status|Duration|+--------------------------------+----------+|starting|9.3E-5||Executinghookontransaction|6E-6||starting|8E-6||checkingpermissions|5E-6||checkingpermissions|4E-6||Openingtables|0.004849||init|1.8E-5||Systemlock|1.4E-5||optimizing|1.4E-5||statistics|3.1E-5||preparing|2.2E-5||executing|3E-6||Sendingdata|1.081273|《《《《《《《《《《《《《《《《《《《|end|1.3E-5||queryend|3E-6||waitingforhandlercommit|1E-5||closingtables|5.2E-5||freeingitems|0.000171||cleaningup|2.9E-5|+--------------------------------+----------+
关于sending data和executing 解析:
Sending data
The thread is reading and processing rows for a SELECT statement, and sending data to the client. Because operations occurring during this state tend to perform large amounts of disk access (reads), it is often the longest-running state over the lifetime of a given query.
说Sending data 是:线程正在为一个select语句读取和处理行,并且发送数据到客户端。因为这期间操作倾向于大量的磁盘访问(读取),所以这常是整个查询周期中运行时间最长的阶段。(这是MySQL 5.5的解释,5.7的解释完全一样,但是5.7多了一个Sending to client状态)
这样就清楚了,Sending data 做了 读取,处理(过滤,排序等。。)和发送 三件事情,接下来再看该状态下的cpu 和 io 信息 以分析语句的瓶颈是 读取还是处理 ,再做相应的优化调整。。
executing
The thread has begun executing a statement.
在exists语句中不断地executing和Sending data 应该是在不断地扫描抓取数据进行匹配,那这应该与MYSQL 的算法有关,通过trace查看其实是在多次子查询 join_execution:
{"subselect_execution":{"select#":2,"steps":[{"join_execution":{"select#":2,"steps":[]/*steps*/}/*join_execution*/}]/*steps*/}/*subselect_execution*/},
想知道这是鸡肋吗,不知有没有相关的解析?
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。