怎么在Mysql中利用join优化sql
怎么在Mysql中利用join优化sql?针对这个问题,这篇文章详细介绍了相对应的分析和解答,希望可以帮助更多想解决这个问题的小伙伴找到更简单易行的方法。
0. 准备相关表来进行接下来的测试
user1表,取经组+----+-----------+-----------------+---------------------------------+|id|user_name|comment|mobile|+----+-----------+-----------------+---------------------------------+|1|唐僧|旃檀功德佛|138245623,021-382349||2|孙悟空|斗战胜佛|159384292,022-483432,+86-392432||3|猪八戒|净坛使者|183208243,055-8234234||4|沙僧|金身罗汉|293842295,098-2383429||5|NULL|白龙马|993267899|+----+-----------+-----------------+---------------------------------+user2表,悟空的朋友圈+----+--------------+-----------+|id|user_name|comment|+----+--------------+-----------+|1|孙悟空|美猴王||2|牛魔王|牛哥||3|铁扇公主|牛夫人||4|菩提老祖|葡萄||5|NULL|晶晶|+----+--------------+-----------+user1_kills表,取经路上杀的妖怪数量+----+-----------+---------------------+-------+|id|user_name|timestr|kills|+----+-----------+---------------------+-------+|1|孙悟空|2013-01-1000:00:00|10||2|孙悟空|2013-02-0100:00:00|2||3|孙悟空|2013-02-0500:00:00|12||4|孙悟空|2013-02-1200:00:00|22||5|猪八戒|2013-01-1100:00:00|20||6|猪八戒|2013-02-0700:00:00|17||7|猪八戒|2013-02-0800:00:00|35||8|沙僧|2013-01-1000:00:00|3||9|沙僧|2013-01-2200:00:00|9||10|沙僧|2013-02-1100:00:00|5|+----+-----------+---------------------+-------+user1_equipment表,取经组装备+----+-----------+--------------+-----------------+-----------------+|id|user_name|arms|clothing|shoe|+----+-----------+--------------+-----------------+-----------------+|1|唐僧|九环锡杖|锦斓袈裟|僧鞋||2|孙悟空|金箍棒|梭子黄金甲|藕丝步云履||3|猪八戒|九齿钉耙|僧衣|僧鞋||4|沙僧|降妖宝杖|僧衣|僧鞋|+----+-----------+--------------+-----------------+-----------------+
1. 使用left join优化not in子句
例子:找出取经组中不属于悟空朋友圈的人
+----+-----------+-----------------+-----------------------+|id|user_name|comment|mobile|+----+-----------+-----------------+-----------------------+|1|唐僧|旃檀功德佛|138245623,021-382349||3|猪八戒|净坛使者|183208243,055-8234234||4|沙僧|金身罗汉|293842295,098-2383429|+----+-----------+-----------------+-----------------------+
not in写法:
select*fromuser1awherea.user_namenotin(selectuser_namefromuser2whereuser_nameisnotnull);
left join写法:
首先看通过user_name进行连接的外连接数据集
selecta.*,b.*fromuser1aleftjoinuser2bon(a.user_name=b.user_name);
+----+-----------+-----------------+---------------------------------+------+-----------+-----------+|id|user_name|comment|mobile|id|user_name|comment|+----+-----------+-----------------+---------------------------------+------+-----------+-----------+|2|孙悟空|斗战胜佛|159384292,022-483432,+86-392432|1|孙悟空|美猴王||1|唐僧|旃檀功德佛|138245623,021-382349|NULL|NULL|NULL||3|猪八戒|净坛使者|183208243,055-8234234|NULL|NULL|NULL||4|沙僧|金身罗汉|293842295,098-2383429|NULL|NULL|NULL||5|NULL|白龙马|993267899|NULL|NULL|NULL|+----+-----------+-----------------+---------------------------------+------+-----------+-----------+
可以看到a表中的所有数据都有显示,b表中的数据只有b.user_name与a.user_name相等才显示,其余都以null值填充,要想找出取经组中不属于悟空朋友圈的人,只需要在b.user_name中加一个过滤条件b.user_name is null即可。
selecta.*fromuser1aleftjoinuser2bon(a.user_name=b.user_name)whereb.user_nameisnull;
+----+-----------+-----------------+-----------------------+|id|user_name|comment|mobile|+----+-----------+-----------------+-----------------------+|1|唐僧|旃檀功德佛|138245623,021-382349||3|猪八戒|净坛使者|183208243,055-8234234||4|沙僧|金身罗汉|293842295,098-2383429||5|NULL|白龙马|993267899|+----+-----------+-----------------+-----------------------+
看到这里发现结果集中还多了一个白龙马,继续添加过滤条件a.user_name is not null即可。
selecta.*fromuser1aleftjoinuser2bon(a.user_name=b.user_name)whereb.user_nameisnullanda.user_nameisnotnull;
2. 使用left join优化标量子查询
例子:查看取经组中的人在悟空朋友圈的昵称
+-----------+-----------------+-----------+|user_name|comment|comment2|+-----------+-----------------+-----------+|唐僧|旃檀功德佛|NULL||孙悟空|斗战胜佛|美猴王||猪八戒|净坛使者|NULL||沙僧|金身罗汉|NULL||NULL|白龙马|NULL|+-----------+-----------------+-----------+
子查询写法:
selecta.user_name,a.comment,(selectcommentfromuser2bwhereb.user_name=a.user_name)comment2fromuser1a;
left join写法:
selecta.user_name,a.comment,b.commentcomment2fromuser1aleftjoinuser2bon(a.user_name=b.user_name);
3. 使用join优化聚合子查询
例子:查询出取经组中每人打怪最多的日期
+----+-----------+---------------------+-------+|id|user_name|timestr|kills|+----+-----------+---------------------+-------+|4|孙悟空|2013-02-1200:00:00|22||7|猪八戒|2013-02-0800:00:00|35||9|沙僧|2013-01-2200:00:00|9|+----+-----------+---------------------+-------+
聚合子查询写法:
select*fromuser1_killsawherea.kills=(selectmax(b.kills)fromuser1_killsbwhereb.user_name=a.user_name);
join写法:
首先看两表自关联的结果集,为节省篇幅,只取猪八戒的打怪数据来看
selecta.*,b.*fromuser1_killsajoinuser1_killsbon(a.user_name=b.user_name)orderby1;
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+|id|user_name|timestr|kills|id|user_name|timestr|kills|+----+-----------+---------------------+-------+----+-----------+---------------------+-------+|5|猪八戒|2013-01-1100:00:00|20|5|猪八戒|2013-01-1100:00:00|20||5|猪八戒|2013-01-1100:00:00|20|6|猪八戒|2013-02-0700:00:00|17||5|猪八戒|2013-01-1100:00:00|20|7|猪八戒|2013-02-0800:00:00|35||6|猪八戒|2013-02-0700:00:00|17|7|猪八戒|2013-02-0800:00:00|35||6|猪八戒|2013-02-0700:00:00|17|5|猪八戒|2013-01-1100:00:00|20||6|猪八戒|2013-02-0700:00:00|17|6|猪八戒|2013-02-0700:00:00|17||7|猪八戒|2013-02-0800:00:00|35|5|猪八戒|2013-01-1100:00:00|20||7|猪八戒|2013-02-0800:00:00|35|6|猪八戒|2013-02-0700:00:00|17||7|猪八戒|2013-02-0800:00:00|35|7|猪八戒|2013-02-0800:00:00|35|+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
可以看到当两表通过user_name进行自关联,只需要对a表的所有字段进行一个group by,取b表中的max(kills),只要a.kills=max(b.kills)就满足要求了。sql如下
selecta.*fromuser1_killsajoinuser1_killsbon(a.user_name=b.user_name)groupbya.id,a.user_name,a.timestr,a.killshavinga.kills=max(b.kills);
4. 使用join进行分组选择
例子:对第3个例子进行升级,查询出取经组中每人打怪最多的前两个日期
+----+-----------+---------------------+-------+|id|user_name|timestr|kills|+----+-----------+---------------------+-------+|3|孙悟空|2013-02-0500:00:00|12||4|孙悟空|2013-02-1200:00:00|22||5|猪八戒|2013-01-1100:00:00|20||7|猪八戒|2013-02-0800:00:00|35||9|沙僧|2013-01-2200:00:00|9||10|沙僧|2013-02-1100:00:00|5|+----+-----------+---------------------+-------+
在oracle中,可以通过分析函数来实现
selectb.*from(selecta.*,row_number()over(partitionbyuser_nameorderbykillsdesc)cntfromuser1_killsa)bwhereb.cnt<=2;
很遗憾,上面sql在mysql中报错ERROR 1064 (42000): You have an error in your SQL syntax; 因为mysql并不支持分析函数。不过可以通过下面的方式去实现。
首先对两表进行自关联,为了节约篇幅,只取出孙悟空的数据
selecta.*,b.*fromuser1_killsajoinuser1_killsbon(a.user_name=b.user_nameanda.kills<=b.kills)orderbya.user_name,a.killsdesc;
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+|id|user_name|timestr|kills|id|user_name|timestr|kills|+----+-----------+---------------------+-------+----+-----------+---------------------+-------+|4|孙悟空|2013-02-1200:00:00|22|4|孙悟空|2013-02-1200:00:00|22||3|孙悟空|2013-02-0500:00:00|12|3|孙悟空|2013-02-0500:00:00|12||3|孙悟空|2013-02-0500:00:00|12|4|孙悟空|2013-02-1200:00:00|22||1|孙悟空|2013-01-1000:00:00|10|1|孙悟空|2013-01-1000:00:00|10||1|孙悟空|2013-01-1000:00:00|10|3|孙悟空|2013-02-0500:00:00|12||1|孙悟空|2013-01-1000:00:00|10|4|孙悟空|2013-02-1200:00:00|22||2|孙悟空|2013-02-0100:00:00|2|1|孙悟空|2013-01-1000:00:00|10||2|孙悟空|2013-02-0100:00:00|2|3|孙悟空|2013-02-0500:00:00|12||2|孙悟空|2013-02-0100:00:00|2|4|孙悟空|2013-02-1200:00:00|22||2|孙悟空|2013-02-0100:00:00|2|2|孙悟空|2013-02-0100:00:00|2|+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
从上面的表中我们知道孙悟空打怪前两名的数量是22和12,那么只需要对a表的所有字段进行一个group by,对b表的id做个count,count值小于等于2就满足要求,sql改写如下:
selecta.*fromuser1_killsajoinuser1_killsbon(a.user_name=b.user_nameanda.kills<=b.kills)groupbya.id,a.user_name,a.timestr,a.killshavingcount(b.id)<=2;
5. 使用笛卡尔积关联实现一列转多行
例子:将取经组中每个电话号码变成一行
原始数据:
+-----------+---------------------------------+|user_name|mobile|+-----------+---------------------------------+|唐僧|138245623,021-382349||孙悟空|159384292,022-483432,+86-392432||猪八戒|183208243,055-8234234||沙僧|293842295,098-2383429||NULL|993267899|+-----------+---------------------------------+
想要得到的数据:
+-----------+-------------+|user_name|mobile|+-----------+-------------+|唐僧|138245623||唐僧|021-382349||孙悟空|159384292||孙悟空|022-483432||孙悟空|+86-392432||猪八戒|183208243||猪八戒|055-8234234||沙僧|293842295||沙僧|098-2383429||NULL|993267899|+-----------+-------------+
可以看到唐僧有两个电话,因此他就需要两行。我们可以先求出每人的电话号码数量,然后与一张序列表进行笛卡儿积关联,为了节约篇幅,只取出唐僧的数据
selecta.id,b.*fromtb_sequenceacrossjoin(selectuser_name,mobile,length(mobile)-length(replace(mobile,',',''))+1sizefromuser1)borderby2,1;
+----+-----------+---------------------------------+------+|id|user_name|mobile|size|+----+-----------+---------------------------------+------+|1|唐僧|138245623,021-382349|2||2|唐僧|138245623,021-382349|2||3|唐僧|138245623,021-382349|2||4|唐僧|138245623,021-382349|2||5|唐僧|138245623,021-382349|2||6|唐僧|138245623,021-382349|2||7|唐僧|138245623,021-382349|2||8|唐僧|138245623,021-382349|2||9|唐僧|138245623,021-382349|2||10|唐僧|138245623,021-382349|2|+----+-----------+---------------------------------+------+
a.id对应的就是第几个电话号码,size就是总的电话号码数量,因此可以加上关联条件(a.id <= b.size),将上面的sql继续调整
selectb.user_name,replace(substring(substring_index(b.mobile,',',a.id),char_length(substring_index(mobile,',',a.id-1))+1),',','')asmobilefromtb_sequenceacrossjoin(selectuser_name,concat(mobile,',')asmobile,length(mobile)-length(replace(mobile,',',''))+1sizefromuser1)bon(a.id<=b.size);
6. 使用笛卡尔积关联实现多列转多行
例子:将取经组中每件装备变成一行
原始数据:
+----+-----------+--------------+-----------------+-----------------+|id|user_name|arms|clothing|shoe|+----+-----------+--------------+-----------------+-----------------+|1|唐僧|九环锡杖|锦斓袈裟|僧鞋||2|孙悟空|金箍棒|梭子黄金甲|藕丝步云履||3|猪八戒|九齿钉耙|僧衣|僧鞋||4|沙僧|降妖宝杖|僧衣|僧鞋|+----+-----------+--------------+-----------------+-----------------+
想要得到的数据:
+-----------+-----------+-----------------+|user_name|equipment|equip_mame|+-----------+-----------+-----------------+|唐僧|arms|九环锡杖||唐僧|clothing|锦斓袈裟||唐僧|shoe|僧鞋||孙悟空|arms|金箍棒||孙悟空|clothing|梭子黄金甲||孙悟空|shoe|藕丝步云履||沙僧|arms|降妖宝杖||沙僧|clothing|僧衣||沙僧|shoe|僧鞋||猪八戒|arms|九齿钉耙||猪八戒|clothing|僧衣||猪八戒|shoe|僧鞋|+-----------+-----------+-----------------+
union的写法:
selectuser_name,'arms'asequipment,armsequip_mamefromuser1_equipmentunionallselectuser_name,'clothing'asequipment,clothingequip_mamefromuser1_equipmentunionallselectuser_name,'shoe'asequipment,shoeequip_mamefromuser1_equipmentorderby1,2;
join的写法:
首先看笛卡尔数据集的效果,以唐僧为例
selecta.*,b.*fromuser1_equipmentacrossjointb_sequencebwhereb.id<=3;
+----+-----------+--------------+-----------------+-----------------+----+|id|user_name|arms|clothing|shoe|id|+----+-----------+--------------+-----------------+-----------------+----+|1|唐僧|九环锡杖|锦斓袈裟|僧鞋|1||1|唐僧|九环锡杖|锦斓袈裟|僧鞋|2||1|唐僧|九环锡杖|锦斓袈裟|僧鞋|3|+----+-----------+--------------+-----------------+-----------------+----+
使用case对上面的结果进行处理
selectuser_name,casewhenb.id=1then'arms'whenb.id=2then'clothing'whenb.id=3then'shoe'endasequipment,casewhenb.id=1thenarmsendarms,casewhenb.id=2thenclothingendclothing,casewhenb.id=3thenshoeendshoefromuser1_equipmentacrossjointb_sequencebwhereb.id<=3;
+-----------+-----------+--------------+-----------------+-----------------+|user_name|equipment|arms|clothing|shoe|+-----------+-----------+--------------+-----------------+-----------------+|唐僧|arms|九环锡杖|NULL|NULL||唐僧|clothing|NULL|锦斓袈裟|NULL||唐僧|shoe|NULL|NULL|僧鞋|+-----------+-----------+--------------+-----------------+-----------------+
使用coalesce函数将多列数据进行合并
selectuser_name,casewhenb.id=1then'arms'whenb.id=2then'clothing'whenb.id=3then'shoe'endasequipment,coalesce(casewhenb.id=1thenarmsend,casewhenb.id=2thenclothingend,casewhenb.id=3thenshoeend)equip_mamefromuser1_equipmentacrossjointb_sequencebwhereb.id<=3orderby1,2;
7. 使用join更新过滤条件中包含自身的表
例子:把同时存在于取经组和悟空朋友圈中的人,在取经组中把comment字段更新为"此人在悟空的朋友圈"
我们很自然地想到先查出user1和user2中user_name都存在的人,然后更新user1表,sql如下
updateuser1setcomment='此人在悟空的朋友圈'whereuser_namein(selecta.user_namefromuser1ajoinuser2bon(a.user_name=b.user_name));
很遗憾,上面sql在mysql中报错:ERROR 1093 (HY000): You can't specify target table 'user1' for update in FROM clause,提示不能更新目标表在from子句的表。
那有没有其它办法呢?我们可以将in的写法转换成join的方式
selectc.*,d.*fromuser1cjoin(selecta.user_namefromuser1ajoinuser2bon(a.user_name=b.user_name))don(c.user_name=d.user_name);
+----+-----------+--------------+---------------------------------+-----------+|id|user_name|comment|mobile|user_name|+----+-----------+--------------+---------------------------------+-----------+|2|孙悟空|斗战胜佛|159384292,022-483432,+86-392432|孙悟空|+----+-----------+--------------+---------------------------------+-----------+
然后对join之后的视图进行更新即可
updateuser1cjoin(selecta.user_namefromuser1ajoinuser2bon(a.user_name=b.user_name))don(c.user_name=d.user_name)setc.comment='此人在悟空的朋友圈';
再查看user1,可以看到user1已修改成功
select*fromuser1;
+----+-----------+-----------------------------+---------------------------------+|id|user_name|comment|mobile|+----+-----------+-----------------------------+---------------------------------+|1|唐僧|旃檀功德佛|138245623,021-382349||2|孙悟空|此人在悟空的朋友圈|159384292,022-483432,+86-392432||3|猪八戒|净坛使者|183208243,055-8234234||4|沙僧|金身罗汉|293842295,098-2383429||5|NULL|白龙马|993267899|+----+-----------+-----------------------------+---------------------------------+
8. 使用join删除重复数据
首先向user2表中插入两条数据
insertintouser2(user_name,comment)values('孙悟空','美猴王');insertintouser2(user_name,comment)values('牛魔王','牛哥');
例子:将user2表中的重复数据删除,只保留id号大的
+----+--------------+-----------+|id|user_name|comment|+----+--------------+-----------+|1|孙悟空|美猴王||2|牛魔王|牛哥||3|铁扇公主|牛夫人||4|菩提老祖|葡萄||5|NULL|晶晶||6|孙悟空|美猴王||7|牛魔王|牛哥|+----+--------------+-----------+
首先查看重复记录
selecta.*,b.*fromuser2ajoin(selectuser_name,comment,max(id)idfromuser2groupbyuser_name,commenthavingcount(*)>1)bon(a.user_name=b.user_nameanda.comment=b.comment)orderby2;
+----+-----------+-----------+-----------+-----------+------+|id|user_name|comment|user_name|comment|id|+----+-----------+-----------+-----------+-----------+------+|1|孙悟空|美猴王|孙悟空|美猴王|6||6|孙悟空|美猴王|孙悟空|美猴王|6||2|牛魔王|牛哥|牛魔王|牛哥|7||7|牛魔王|牛哥|牛魔王|牛哥|7|+----+-----------+-----------+-----------+-----------+------+
接着只需要删除(a.id < b.id)的数据即可
deleteafromuser2ajoin(selectuser_name,comment,max(id)idfromuser2groupbyuser_name,commenthavingcount(*)>1)bon(a.user_name=b.user_nameanda.comment=b.comment)wherea.id<b.id;
查看user2,可以看到重复数据已经被删掉了
select*fromuser2;
+----+--------------+-----------+|id|user_name|comment|+----+--------------+-----------+|3|铁扇公主|牛夫人||4|菩提老祖|葡萄||5|NULL|晶晶||6|孙悟空|美猴王||7|牛魔王|牛哥|+----+--------------+-----------+
关于怎么在Mysql中利用join优化sql问题的解答就分享到这里了,希望以上内容可以对大家有一定的帮助,如果你还有很多疑惑没有解开,可以关注亿速云行业资讯频道了解更多相关知识。
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。