怎么理解PostgreSQL全表扫描问题
这篇文章主要讲解了“怎么理解PostgreSQL全表扫描问题”,文中的讲解内容简单清晰,易于学习与理解,下面请大家跟着小编的思路慢慢深入,一起来研究和学习“怎么理解PostgreSQL全表扫描问题”吧!
本节内容来源于PGer的一个问题:
Q:
由于多版本的存在,那么全表扫描是不是需要更长的时间了呢?
A:
关于全表扫描,不妨考虑2种极端的情况:
1.insert数据(事务已提交,下同),没有执行update/delete,没有dead tuple,全表扫描效率没有影响;
2.insert数据,执行了大量的update/delete,同时禁用了autovacuum也没有手工执行vacuum,那么存在大量的dead tuple,性能上一是需要更多的IO操作,二是需要执行额外的CPU判断(对于所有的tuple都要执行可见性判断).
其判断逻辑如下:
((Xmin==my-transaction&& insertedbythecurrenttransaction Cmin<my-command&& beforethiscommand,and (Xmaxisnull|| therowhasnotbeendeleted,or (Xmax==my-transaction&& itwasdeletedbythecurrenttransaction Cmax>=my-command))) butnotbeforethiscommand,|| or (Xminiscommitted&& therowwasinsertedbyacommittedtransaction,and (Xmaxisnull|| therowhasnotbeendeleted,or (Xmax==my-transaction&& therowisbeingdeletedbythistransaction Cmax>=my-command)|| butit’snotdeleted"yet",or (Xmax!=my-transaction&& therowwasdeletedbyanothertransaction Xmaxisnotcommitted)))) thathasnotbeencommitted
简单做个实验,创建一张表t_fts,
1.插入数据,大小为s1,执行全表扫描,时间为m秒;
2.update所有行,大小为s2,执行全表扫描,时间为n秒.
理论上来说,n应为m的s2/s1倍左右(相对于IO时间,如果tuple数不多,CPU时间可以忽略不计).
创建数据表,插入数据:
testdb=#droptableifexistst_fts;DROPTABLEtestdb=#createtablet_fts(idint,c1varchar(200),c2varchar(200));CREATETABLEtestdb=#testdb=#insertintot_ftsselectx,lpad('c1'||x,200,'x'),lpad('c1'||x,200,'x')fromgenerate_series(1,2000000)asx;INSERT02000000testdb=#selectpg_size_pretty(pg_table_size('t_fts'));pg_size_pretty----------------868MB(1row)
禁用autovacuum,执行查询:
testdb=#altersystemsetautovacuum=off;ALTERSYSTEMtestdb=#showautovacuum;autovacuum------------off(1row)testdb=#explainanalyzeverboseselect*fromt_fts;QUERYPLAN----------------------------------------------------------------------------------------------------------------------------SeqScanonpublic.t_fts(cost=0.00..131112.16rows=2000016width=412)(actualtime=0.048..1086.289rows=2000000loops=1)Output:id,c1,c2PlanningTime:30.762msExecutionTime:1181.360ms(4rows)
执行update:
testdb=#updatet_ftssetc1=lpad('c1'||(id+1),200,id+1||''),c2=lpad('c1'||(id+1),200,id+1||'');UPDATE2000000testdb=#selectpg_size_pretty(pg_table_size('t_fts'));pg_size_pretty----------------1737MB(1row)testdb=#explainanalyzeverboseselect*fromt_fts;QUERYPLAN-------------------------------------------------------------------------------------------------------------------------------SeqScanonpublic.t_fts(cost=0.00..262223.14rows=4000014width=412)(actualtime=3168.414..6117.780rows=2000000loops=1)Output:id,c1,c2PlanningTime:5.493msExecutionTime:6205.705ms(4rows)testdb=#explainanalyzeverboseselect*fromt_fts;QUERYPLAN------------------------------------------------------------------------------------------------------------------------------SeqScanonpublic.t_fts(cost=0.00..262223.14rows=4000014width=412)(actualtime=776.660..2311.270rows=2000000loops=1)Output:id,c1,c2PlanningTime:0.426msExecutionTime:2391.895ms(4rows)testdb=#explainanalyzeverboseselect*fromt_fts;QUERYPLAN------------------------------------------------------------------------------------------------------------------------------SeqScanonpublic.t_fts(cost=0.00..262223.14rows=4000014width=412)(actualtime=728.758..2293.157rows=2000000loops=1)Output:id,c1,c2PlanningTime:0.481msExecutionTime:2373.241ms(4rows)
感谢各位的阅读,以上就是“怎么理解PostgreSQL全表扫描问题”的内容了,经过本文的学习后,相信大家对怎么理解PostgreSQL全表扫描问题这一问题有了更深刻的体会,具体使用情况还需要大家实践验证。这里是亿速云,小编将为大家推送更多相关知识点的文章,欢迎关注!
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。