这篇文章主要介绍“PostgreSQL中remove_useless_joins的实现逻辑是怎样的”,在日常操作中,相信很多人在PostgreSQL中remove_useless_joins的实现逻辑是怎样的问题上存在疑惑,小编查阅了各式资料,整理出简单好用的操作方法,希望对大家解答”PostgreSQL中remove_useless_joins的实现逻辑是怎样的”的疑惑有所帮助!接下来,请跟着小编一起来学习吧!

query_planner代码片段:

//.../**Removeanyuselessouterjoins.Ideallythiswouldbedoneduring*jointreepreprocessing,butthenecessaryinformationisn'tavailable*untilwe'vebuiltbasereldatastructuresandclassifiedqualclauses.*/joinlist=remove_useless_joins(root,joinlist);//清除无用的外连接/**Also,reduceanysemijoinswithuniqueinnerrelstoplaininnerjoins.*Likewise,thiscan'tbedoneuntilnowforlackofneededinfo.*/reduce_unique_semijoins(root);//消除半连接/**Nowdistribute"placeholders"tobaserelsasneeded.Thishastobe*doneafterjoinremovalbecauseremovalcouldchangewhethera*placeholderisevaluableatabaserel.*/add_placeholders_to_base_rels(root);//在"baserels"中添加PH//...一、数据结构

PlaceHolderVar
上一小节已介绍过PHInfo

/**Placeholdernodeforanexpressiontobeevaluatedbelowthetoplevel*ofaplantree.Thisisusedduringplanningtorepresentthecontained*expression.Attheendoftheplanningprocessitisreplacedbyeither*thecontainedexpressionoraVarreferringtoalower-levelevaluationof*thecontainedexpression.Typicallytheevaluationoccursbelowanouter*join,andVarreferencesabovetheouterjoinmighttherebyyieldNULL*insteadoftheexpressionvalue.**Althoughtheplannertreatsthisasanexpressionnodetype,itisnot*recognizedbytheparserorexecutor,sowedeclareithereratherthan*inprimnodes.h.*/typedefstructPlaceHolderVar{Exprxpr;Expr*phexpr;/*therepresentedexpression*/Relidsphrels;/*baserelidssyntacticallywithinexprsrc*/Indexphid;/*IDforPHV(uniquewithinplannerrun)*/Indexphlevelsup;/*>0ifPHVbelongstoouterquery*/}PlaceHolderVar;

SpecialJoinInfo

/**"Specialjoin"info.**One-sidedouterjoinsconstraintheorderofjoiningpartiallybutnot*completely.Weflattensuchjoinsintotheplanner'stop-levellistof*relationstojoin,butrecordinformationabouteachouterjoinina*SpecialJoinInfostruct.ThesestructsarekeptinthePlannerInfonode's*join_info_list.**Similarly,semijoinsandantijoinscreatedbyflatteningIN(subselect)*andEXISTS(subselect)clausescreatepartialconstraintsonjoinorder.*ThesearelikewiserecordedinSpecialJoinInfostructs.**WemakeSpecialJoinInfosforFULLJOINseventhoughthereisnoflexibility*ofplanningforthem,becausethissimplifiesmake_join_rel()'sAPI.**min_lefthandandmin_righthandarethesetsofbaserelidsthatmustbe*availableoneachsidewhenperformingthespecialjoin.lhs_strictis*trueifthespecialjoin'sconditioncannotsucceedwhentheLHSvariables*areallNULL(thismeansthatanouterjoincancommutewithupper-level*outerjoinsevenifitappearsintheirRHS).Wedon'tbothertoset*lhs_strictforFULLJOINs,however.**Itisnotvalidforeithermin_lefthandormin_righthandtobeemptysets;*iftheywere,thiswouldbreakthelogicthatenforcesjoinorder.**syn_lefthandandsyn_righthandarethesetsofbaserelidsthatare*syntacticallybelowthisspecialjoin.(Theseareneededtohelpcompute*min_lefthandandmin_righthandforhigherjoins.)**delay_upper_joinsissettrueifwedetectapushed-downclausethathas*tobeevaluatedafterthisjoinisformed(becauseitreferencestheRHS).*AnyouterjoinsthathavesuchaclauseandthisjoinintheirRHScannot*commutewiththisjoin,becausethatwouldleavenoplacetocheckthe*pushed-downclause.(Wedon'ttrackthisforFULLJOINs,either.)**Forasemijoin,wealsoextractthejoinoperatorsandtheirRHSarguments*andsetsemi_operators,semi_rhs_exprs,semi_can_btree,andsemi_can_hash.*Thisisdoneinsupportofpossiblyunique-ifyingtheRHS,sowedon't*botherunlessatleastoneofsemi_can_btreeandsemi_can_hashcanbeset*true.(Youmightexpectthatthisinformationwouldbecomputedduring*joinplanning;butit'shelpfultohaveitavailableduringplanningof*parameterizedtablescans,sowestoreitintheSpecialJoinInfostructs.)**jointypeisneverJOIN_RIGHT;aRIGHTJOINishandledbyswitching*theinputstomakeitaLEFTJOIN.Sotheallowedvaluesofjointype*inajoin_info_listmemberareonlyLEFT,FULL,SEMI,orANTI.**Forpurposesofjoinselectivityestimation,wecreatetransient*SpecialJoinInfostructuresforregularinnerjoins;soitispossible*tohavejointype==JOIN_INNERinsuchastructure,eventhoughthisis*notallowedwithinjoin_info_list.Wealsocreatetransient*SpecialJoinInfoswithjointype==JOIN_INNERforouterjoins,sincefor*costestimationpurposesitissometimesusefultoknowthejoinsizeunder*plaininnerjoinsemantics.Notethatlhs_strict,delay_upper_joins,and*ofcoursethesemi_xxxfieldsarenotsetmeaningfullywithinsuchstructs.*/typedefstructSpecialJoinInfo{NodeTagtype;Relidsmin_lefthand;/*baserelidsinminimumLHSforjoin*/Relidsmin_righthand;/*baserelidsinminimumRHSforjoin*/Relidssyn_lefthand;/*baserelidssyntacticallywithinLHS*/Relidssyn_righthand;/*baserelidssyntacticallywithinRHS*/JoinTypejointype;/*alwaysINNER,LEFT,FULL,SEMI,orANTI*/boollhs_strict;/*joinclauseisstrictforsomeLHSrel*/booldelay_upper_joins;/*can'tcommutewithupperRHS*//*RemainingfieldsaresetonlyforJOIN_SEMIjointype:*/boolsemi_can_btree;/*trueifsemi_operatorsareallbtree*/boolsemi_can_hash;/*trueifsemi_operatorsareallhash*/List*semi_operators;/*OIDsofequalityjoinoperators*/List*semi_rhs_exprs;/*righthand-sideexpressionsoftheseops*/}SpecialJoinInfo;二、源码解读

remove_useless_joins
清除无用的连接,比如以下的SQL语句:

selectt1.dwbhfromt_grxxt1leftjoint_dwxxt2ont1.dwbh=t2.dwbh;

左连接,而且t_dwxx.dwbh唯一,这样的连接是不需要的连接,直接查询t_grxx即可.
从执行计划来看,PG只对t_grxx进行扫描:

testdb=#explainverboseselectt1.dwbhfromt_grxxt1leftjoint_dwxxt2ont1.dwbh=t2.dwbh;QUERYPLAN--------------------------------------------------------------------SeqScanonpublic.t_grxxt1(cost=0.00..14.00rows=400width=38)Output:t1.dwbh(2rows)

源代码如下:

/**remove_useless_joins*Checkforrelationsthatdon'tactuallyneedtobejoinedatall,*andremovethemfromthequery.**Wearepassedthecurrentjoinlistandreturntheupdatedlist.Other*datastructuresthathavetobeupdatedareaccessiblevia"root".*/List*remove_useless_joins(PlannerInfo*root,List*joinlist){ListCell*lc;/**Weareonlyinterestedinrelationsthatareleft-joinedto,sowecan*scanthejoin_info_listtofindthemeasily.*/restart:foreach(lc,root->join_info_list)//遍历连接信息链表{SpecialJoinInfo*sjinfo=(SpecialJoinInfo*)lfirst(lc);intinnerrelid;intnremoved;/*Skipifnotremovable*/if(!join_is_removable(root,sjinfo))//判断是否可以清除连接continue;/**Currently,join_is_removablecanonlysucceedwhenthesjinfo's*righthandisasinglebaserel.Removethatrelfromthequeryand*joinlist.*/innerrelid=bms_singleton_member(sjinfo->min_righthand);remove_rel_from_query(root,innerrelid,bms_union(sjinfo->min_lefthand,sjinfo->min_righthand));//从查询中删除相应的Rel/*Weverifythatexactlyonereferencegetsremovedfromjoinlist*/nremoved=0;joinlist=remove_rel_from_joinlist(joinlist,innerrelid,&nremoved);if(nremoved!=1)elog(ERROR,"failedtofindrelation%dinjoinlist",innerrelid);/**WecandeletethisSpecialJoinInfofromthelisttoo,sinceit'sno*longerofinterest.*///更新连接链表信息root->join_info_list=list_delete_ptr(root->join_info_list,sjinfo);/**Restartthescan.Thisisnecessarytoensurewefindall*removablejoinsindependentlyoforderingofthejoin_info_list*(notethatremovalofattr_neededbitsmaymakeajoinappear*removablethatdidnotbefore).Also,sincewejustdeletedthe*currentlistcell,we'dhavetohavesomeklugetocontinuethe*listscananyway.*/gotorestart;}returnjoinlist;}

reduce_unique_semijoins
把可以简化的半连接转化为内连接.
比如以下的SQL语句:

selectt1.*fromt_grxxt1wheredwbhIN(selectt2.dwbhfromt_dwxxt2);

由于子查询"select t2.dwbh from t_dwxx t2"的dwbh是PK,子查询提升后,t_grxx的dwbh只对应t_dwxx唯一的一条记录,因此可以把半连接转换为内连接,执行计划如下:

testdb=#explainverboseselectt1.*fromt_grxxt1wheredwbhIN(selectt2.dwbhfromt_dwxxt2);QUERYPLAN-----------------------------------------------------------------------------HashJoin(cost=1.07..20.10rows=6width=176)Output:t1.dwbh,t1.grbh,t1.xm,t1.xb,t1.nlInnerUnique:trueHashCond:((t1.dwbh)::text=(t2.dwbh)::text)->SeqScanonpublic.t_grxxt1(cost=0.00..14.00rows=400width=176)Output:t1.dwbh,t1.grbh,t1.xm,t1.xb,t1.nl->Hash(cost=1.03..1.03rows=3width=38)Output:t2.dwbh->SeqScanonpublic.t_dwxxt2(cost=0.00..1.03rows=3width=38)Output:t2.dwbh(10rows)

跟踪分析:

(gdb)n199reduce_unique_semijoins(root);(gdb)stepreduce_unique_semijoins(root=0x1702968)atanalyzejoins.c:520520for(lc=list_head(root->join_info_list);lc!=NULL;lc=next)(gdb)n522SpecialJoinInfo*sjinfo=(SpecialJoinInfo*)lfirst(lc);(gdb)

查看SpecialJoinInfo内存结构:

528next=lnext(lc);(gdb)p*sjinfo$1={type=T_SpecialJoinInfo,min_lefthand=0x1749818,min_righthand=0x1749830,syn_lefthand=0x1749570,syn_righthand=0x17495d0,jointype=JOIN_SEMI,lhs_strict=true,delay_upper_joins=false,semi_can_btree=true,semi_can_hash=true,semi_operators=0x17496c8,semi_rhs_exprs=0x17497b8}

内表(innerrel,即t_dwxx)如支持唯一性,则可以考虑把半连接转换为内连接

550if(!rel_supports_distinctness(root,innerrel))...575root->join_info_list=list_delete_ptr(root->join_info_list,sjinfo);...

源代码如下:

/**reduce_unique_semijoins*Checkforsemijoinsthatcanbesimplifiedtoplaininnerjoins*becausetheinnerrelationisprovablyuniqueforthejoinclauses.**Ideallythiswouldhappenduringreduce_outer_joins,butwedon'thave*enoughinformationatthatpoint.**Toperformthestrengthreductionwhenapplicable,weneedonlydelete*thesemijoin'sSpecialJoinInfofromroot->join_info_list.(Wedon't*botherfixingthejointypeattributedtoitinthequeryjointree,*sincethatwon'tbeconsultedagain.)*/voidreduce_unique_semijoins(PlannerInfo*root){ListCell*lc;ListCell*next;/**Scanthejoin_info_listtofindsemijoins.Wecan'tuseforeach*becausewemaydeletethecurrentcell.*/for(lc=list_head(root->join_info_list);lc!=NULL;lc=next){SpecialJoinInfo*sjinfo=(SpecialJoinInfo*)lfirst(lc);//特殊连接信息,先前通过deconstruct函数生成intinnerrelid;RelOptInfo*innerrel;Relidsjoinrelids;List*restrictlist;next=lnext(lc);/**Mustbeanon-delayingsemijointoasinglebaserel,elsewearen't*goingtobeabletodoanythingwithit.(It'sprobablynot*possiblefordelay_upper_joinstobesetonasemijoin,butwe*mightaswellcheck.)*/if(sjinfo->jointype!=JOIN_SEMI||sjinfo->delay_upper_joins)continue;if(!bms_get_singleton_member(sjinfo->min_righthand,&innerrelid))continue;innerrel=find_base_rel(root,innerrelid);/**Beforewetroubletorungenerate_join_implied_equalities,makea*quickchecktoeliminatecasesinwhichwewillsurelybeunableto*proveuniquenessoftheinnerrel.*/if(!rel_supports_distinctness(root,innerrel))continue;/*Computetherelidsetforthejoinweareconsidering*/joinrelids=bms_union(sjinfo->min_lefthand,sjinfo->min_righthand);/**Sincewe'reonlyconsideringasingle-relRHS,anyjoinclausesit*hasmustbeclauseslinkingittothesemijoin'smin_lefthand.We*canalsoconsiderEC-derivedjoinclauses.*/restrictlist=list_concat(generate_join_implied_equalities(root,joinrelids,sjinfo->min_lefthand,innerrel),innerrel->joininfo);/*Testwhethertheinnerrelisuniqueforthoseclauses.*/if(!innerrel_is_unique(root,joinrelids,sjinfo->min_lefthand,innerrel,JOIN_SEMI,restrictlist,true))continue;/*OK,removetheSpecialJoinInfofromthelist.*/root->join_info_list=list_delete_ptr(root->join_info_list,sjinfo);//删除特殊连接信息}}

add_placeholders_to_base_rels
把PHV分发到base rels中,代码较为简单

/**add_placeholders_to_base_rels*AddanyrequiredPlaceHolderVarstobaserels'targetlists.**Ifanyplaceholdercanbecomputedatabaserelandisneededaboveit,*addittothatrel'stargetlist.Thismightlooklikeitcouldbemerged*withfix_placeholder_input_needed_levels,butitmustbeseparatebecause*joinremovalhappensinbetween,andcanchangetheph_eval_atsets.There*isessentiallythesamelogicinadd_placeholders_to_joinrel,butwecan't*dothatpartuntiljoinrelsareformed.*/voidadd_placeholders_to_base_rels(PlannerInfo*root){ListCell*lc;foreach(lc,root->placeholder_list)//遍历PH链表{PlaceHolderInfo*phinfo=(PlaceHolderInfo*)lfirst(lc);Relidseval_at=phinfo->ph_eval_at;intvarno;if(bms_get_singleton_member(eval_at,&varno)&&bms_nonempty_difference(phinfo->ph_needed,eval_at))//添加到需要的RelOptInfo中{RelOptInfo*rel=find_base_rel(root,varno);rel->reltarget->exprs=lappend(rel->reltarget->exprs,copyObject(phinfo->ph_var));/*reltarget'scostandwidthfieldswillbeupdatedlater*/}}}

到此,关于“PostgreSQL中remove_useless_joins的实现逻辑是怎样的”的学习就结束了,希望能够解决大家的疑惑。理论与实践的搭配能更好的帮助大家学习,快去试试吧!若想继续学习更多相关知识,请继续关注亿速云网站,小编会继续努力为大家带来更多实用的文章!