This article explains how PostgreSQL calls the mergeruns function. It first reviews the data structures involved, then reads through the source of mergeruns, and finally traces a real call in gdb.

1. Data structures

TupleTableSlot
The executor stores tuples in a "tuple table", which is a List of independent TupleTableSlots.

/*----------
 * The executor stores tuples in a "tuple table" which is a List of
 * independent TupleTableSlots.  There are several cases we need to handle:
 *      1. physical tuple in a disk buffer page
 *      2. physical tuple constructed in palloc'ed memory
 *      3. "minimal" physical tuple constructed in palloc'ed memory
 *      4. "virtual" tuple consisting of Datum/isnull arrays
 *
 * The first two cases are similar in that they both deal with "materialized"
 * tuples, but resource management is different.  For a tuple in a disk page
 * we need to hold a pin on the buffer until the TupleTableSlot's reference
 * to the tuple is dropped; while for a palloc'd tuple we usually want the
 * tuple pfree'd when the TupleTableSlot's reference is dropped.
 *
 * A "minimal" tuple is handled similarly to a palloc'd regular tuple.
 * At present, minimal tuples never are stored in buffers, so there is no
 * parallel to case 1.  Note that a minimal tuple has no "system columns".
 * (Actually, it could have an OID, but we have no need to access the OID.)
 *
 * A "virtual" tuple is an optimization used to minimize physical data
 * copying in a nest of plan nodes.  Any pass-by-reference Datums in the
 * tuple point to storage that is not directly associated with the
 * TupleTableSlot; generally they will point to part of a tuple stored in
 * a lower plan node's output TupleTableSlot, or to a function result
 * constructed in a plan node's per-tuple econtext.  It is the responsibility
 * of the generating plan node to be sure these resources are not released
 * for as long as the virtual tuple needs to be valid.  We only use virtual
 * tuples in the result slots of plan nodes --- tuples to be copied anywhere
 * else need to be "materialized" into physical tuples.  Note also that a
 * virtual tuple does not have any "system columns".
 *
 * It is also possible for a TupleTableSlot to hold both physical and minimal
 * copies of a tuple.  This is done when the slot is requested to provide
 * the format other than the one it currently holds.  (Originally we attempted
 * to handle such requests by replacing one format with the other, but that
 * had the fatal defect of invalidating any pass-by-reference Datums pointing
 * into the existing slot contents.)  Both copies must contain identical data
 * payloads when this is the case.
 *
 * The Datum/isnull arrays of a TupleTableSlot serve double duty.  When the
 * slot contains a virtual tuple, they are the authoritative data.  When the
 * slot contains a physical tuple, the arrays contain data extracted from
 * the tuple.  (In this state, any pass-by-reference Datums point into
 * the physical tuple.)  The extracted information is built "lazily",
 * ie, only as needed.  This serves to avoid repeated extraction of data
 * from the physical tuple.
 *
 * A TupleTableSlot can also be "empty", holding no valid data.  This is
 * the only valid state for a freshly-created slot that has not yet had a
 * tuple descriptor assigned to it.  In this state, tts_isempty must be
 * true, tts_shouldFree false, tts_tuple NULL, tts_buffer InvalidBuffer,
 * and tts_nvalid zero.
 *
 * The tupleDescriptor is simply referenced, not copied, by the TupleTableSlot
 * code.  The caller of ExecSetSlotDescriptor() is responsible for providing
 * a descriptor that will live as long as the slot does.  (Typically, both
 * slots and descriptors are in per-query memory and are freed by memory
 * context deallocation at query end; so it's not worth providing any extra
 * mechanism to do more.  However, the slot will increment the tupdesc
 * reference count if a reference-counted tupdesc is supplied.)
 *
 * When tts_shouldFree is true, the physical tuple is "owned" by the slot
 * and should be freed when the slot's reference to the tuple is dropped.
 *
 * If tts_buffer is not InvalidBuffer, then the slot is holding a pin
 * on the indicated buffer page; drop the pin when we release the
 * slot's reference to that buffer.  (tts_shouldFree should always be
 * false in such a case, since presumably tts_tuple is pointing at the
 * buffer page.)
 *
 * tts_nvalid indicates the number of valid columns in the tts_values/isnull
 * arrays.  When the slot is holding a "virtual" tuple this must be equal
 * to the descriptor's natts.  When the slot is holding a physical tuple
 * this is equal to the number of columns we have extracted (we always
 * extract columns from left to right, so there are no holes).
 *
 * tts_values/tts_isnull are allocated when a descriptor is assigned to the
 * slot; they are of length equal to the descriptor's natts.
 *
 * tts_mintuple must always be NULL if the slot does not hold a "minimal"
 * tuple.  When it does, tts_mintuple points to the actual MinimalTupleData
 * object (the thing to be pfree'd if tts_shouldFreeMin is true).  If the slot
 * has only a minimal and not also a regular physical tuple, then tts_tuple
 * points at tts_minhdr and the fields of that struct are set correctly
 * for access to the minimal tuple; in particular, tts_minhdr.t_data points
 * MINIMAL_TUPLE_OFFSET bytes before tts_mintuple.  This allows column
 * extraction to treat the case identically to regular physical tuples.
 *
 * tts_slow/tts_off are saved state for slot_deform_tuple, and should not
 * be touched by any other code.
 *----------
 */
typedef struct TupleTableSlot
{
    NodeTag     type;                   /* node tag */
    bool        tts_isempty;            /* true = slot is empty */
    bool        tts_shouldFree;         /* should pfree tts_tuple? */
    bool        tts_shouldFreeMin;      /* should pfree tts_mintuple? */
#define FIELDNO_TUPLETABLESLOT_SLOW 4
    bool        tts_slow;               /* saved state for slot_deform_tuple */
#define FIELDNO_TUPLETABLESLOT_TUPLE 5
    HeapTuple   tts_tuple;              /* physical tuple, or NULL if virtual */
#define FIELDNO_TUPLETABLESLOT_TUPLEDESCRIPTOR 6
    TupleDesc   tts_tupleDescriptor;    /* slot's tuple descriptor */
    MemoryContext tts_mcxt;             /* slot itself is in this context */
    Buffer      tts_buffer;             /* tuple's buffer, or InvalidBuffer */
#define FIELDNO_TUPLETABLESLOT_NVALID 9
    int         tts_nvalid;             /* # of valid values in tts_values */
#define FIELDNO_TUPLETABLESLOT_VALUES 10
    Datum      *tts_values;             /* current per-attribute values */
#define FIELDNO_TUPLETABLESLOT_ISNULL 11
    bool       *tts_isnull;             /* current per-attribute isnull flags */
    MinimalTuple tts_mintuple;          /* minimal tuple, or NULL if none */
    HeapTupleData tts_minhdr;           /* workspace for minimal-tuple-only case */
#define FIELDNO_TUPLETABLESLOT_OFF 14
    uint32      tts_off;                /* saved state for slot_deform_tuple */
    bool        tts_fixedTupleDescriptor;   /* descriptor can't be changed */
} TupleTableSlot;

The definition above is the PostgreSQL 11 layout. From PostgreSQL 12 on, the slot is reduced to a small base struct, and the format-specific behaviour moves into a table of callbacks (TupleTableSlotOps):

/* base tuple table slot type */
typedef struct TupleTableSlot
{
    NodeTag     type;                   /* node tag */
#define FIELDNO_TUPLETABLESLOT_FLAGS 1
    uint16      tts_flags;              /* Boolean states */
#define FIELDNO_TUPLETABLESLOT_NVALID 2
    AttrNumber  tts_nvalid;             /* # of valid values in tts_values */
    const TupleTableSlotOps *const tts_ops; /* implementation of slot */
#define FIELDNO_TUPLETABLESLOT_TUPLEDESCRIPTOR 4
    TupleDesc   tts_tupleDescriptor;    /* slot's tuple descriptor */
#define FIELDNO_TUPLETABLESLOT_VALUES 5
    Datum      *tts_values;             /* current per-attribute values */
#define FIELDNO_TUPLETABLESLOT_ISNULL 6
    bool       *tts_isnull;             /* current per-attribute isnull flags */
    MemoryContext tts_mcxt;             /* slot itself is in this context */
} TupleTableSlot;

/* routines for a TupleTableSlot implementation */
struct TupleTableSlotOps
{
    /* Minimum size of the slot */
    size_t      base_slot_size;

    /* Initialization. */
    void        (*init) (TupleTableSlot *slot);

    /* Destruction. */
    void        (*release) (TupleTableSlot *slot);

    /*
     * Clear the contents of the slot. Only the contents are expected to be
     * cleared and not the tuple descriptor. Typically an implementation of
     * this callback should free the memory allocated for the tuple contained
     * in the slot.
     */
    void        (*clear) (TupleTableSlot *slot);

    /*
     * Fill up first natts entries of tts_values and tts_isnull arrays with
     * values from the tuple contained in the slot. The function may be called
     * with natts more than the number of attributes available in the tuple,
     * in which case it should set tts_nvalid to the number of returned
     * columns.
     */
    void        (*getsomeattrs) (TupleTableSlot *slot, int natts);

    /*
     * Returns value of the given system attribute as a datum and sets isnull
     * to false, if it's not NULL. Throws an error if the slot type does not
     * support system attributes.
     */
    Datum       (*getsysattr) (TupleTableSlot *slot, int attnum, bool *isnull);

    /*
     * Make the contents of the slot solely depend on the slot, and not on
     * underlying resources (like another memory context, buffers, etc).
     */
    void        (*materialize) (TupleTableSlot *slot);

    /*
     * Copy the contents of the source slot into the destination slot's own
     * context. Invoked using callback of the destination slot.
     */
    void        (*copyslot) (TupleTableSlot *dstslot, TupleTableSlot *srcslot);

    /*
     * Return a heap tuple "owned" by the slot. It is slot's responsibility to
     * free the memory consumed by the heap tuple. If the slot can not "own" a
     * heap tuple, it should not implement this callback and should set it as
     * NULL.
     */
    HeapTuple   (*get_heap_tuple) (TupleTableSlot *slot);

    /*
     * Return a minimal tuple "owned" by the slot. It is slot's responsibility
     * to free the memory consumed by the minimal tuple. If the slot can not
     * "own" a minimal tuple, it should not implement this callback and should
     * set it as NULL.
     */
    MinimalTuple (*get_minimal_tuple) (TupleTableSlot *slot);

    /*
     * Return a copy of heap tuple representing the contents of the slot. The
     * copy needs to be palloc'd in the current memory context. The slot
     * itself is expected to remain unaffected. It is *not* expected to have
     * meaningful "system columns" in the copy. The copy is not "owned" by
     * the slot i.e. the caller has to take responsibility to free memory
     * consumed by the slot.
     */
    HeapTuple   (*copy_heap_tuple) (TupleTableSlot *slot);

    /*
     * Return a copy of minimal tuple representing the contents of the slot.
     * The copy needs to be palloc'd in the current memory context. The slot
     * itself is expected to remain unaffected. It is *not* expected to have
     * meaningful "system columns" in the copy. The copy is not "owned" by
     * the slot i.e. the caller has to take responsibility to free memory
     * consumed by the slot.
     */
    MinimalTuple (*copy_minimal_tuple) (TupleTableSlot *slot);
};

typedef struct tupleDesc
{
    int         natts;          /* number of attributes in the tuple */
    Oid         tdtypeid;       /* composite type ID for tuple type */
    int32       tdtypmod;       /* typmod for tuple type */
    int         tdrefcount;     /* reference count, or -1 if not counting */
    TupleConstr *constr;        /* constraints, or NULL if none */
    /* attrs[N] is the description of Attribute Number N+1 */
    FormData_pg_attribute attrs[FLEXIBLE_ARRAY_MEMBER];
} *TupleDesc;
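
The header comment above says that values are extracted from a physical tuple "lazily" and always from left to right, with tts_nvalid recording how far extraction has progressed. The sketch below illustrates only that idea. ToySlot and the toy_slot_* functions are hypothetical names invented for this example, and the "physical tuple" is simplified to a plain int array, so this is not PostgreSQL's actual slot_deform_tuple logic.

```c
#include <assert.h>

#define TOY_MAX_ATTS 8

/* Hypothetical toy slot: the "physical tuple" is just an array of ints. */
typedef struct ToySlot
{
    const int  *tuple;                  /* stands in for the physical tuple */
    int         natts;                  /* attributes in the tuple */
    int         nvalid;                 /* leading columns already extracted */
    int         values[TOY_MAX_ATTS];   /* extracted per-attribute values */
    int         extractions;            /* per-column extraction work done */
} ToySlot;

/* Store a new tuple: nothing is extracted yet, so nvalid starts at zero. */
static void
toy_slot_store(ToySlot *slot, const int *tuple, int natts)
{
    assert(natts >= 0 && natts <= TOY_MAX_ATTS);
    slot->tuple = tuple;
    slot->natts = natts;
    slot->nvalid = 0;
    slot->extractions = 0;
}

/* Make columns [0, upto) valid; columns already extracted are not redone. */
static void
toy_slot_getsomeattrs(ToySlot *slot, int upto)
{
    if (upto > slot->natts)
        upto = slot->natts;
    while (slot->nvalid < upto)
    {
        slot->values[slot->nvalid] = slot->tuple[slot->nvalid];
        slot->nvalid++;
        slot->extractions++;
    }
}

/* Fetch one attribute (1-based), extracting only as far as needed. */
static int
toy_slot_getattr(ToySlot *slot, int attnum)
{
    assert(attnum >= 1 && attnum <= slot->natts);
    toy_slot_getsomeattrs(slot, attnum);
    return slot->values[attnum - 1];
}
```

Asking for column 3 first extracts columns 1 to 3 in one pass; a later request for column 2 is answered from the values array without touching the tuple again, which is why the extracted prefix never has holes.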

SortState
Run-time state information for a Sort node.

/* ----------------
 *   SortState information
 * ----------------
 */
typedef struct SortState
{
    ScanState   ss;             /* base class; its first field is NodeTag */
    bool        randomAccess;   /* need random access to sort output? */
    bool        bounded;        /* is the result set bounded? */
    int64       bound;          /* if bounded, how many tuples are needed */
    bool        sort_Done;      /* sort completed yet? */
    bool        bounded_Done;   /* value of bounded we did the sort with */
    int64       bound_Done;     /* value of bound we did the sort with */
    void       *tuplesortstate; /* private state of tuplesort.c */
    bool        am_worker;      /* are we a worker? */
    SharedSortInfo *shared_info;    /* one entry per worker */
} SortState;

/* ----------------
 *   Shared memory container for per-worker sort information
 * ----------------
 */
typedef struct SharedSortInfo
{
    int         num_workers;    /* number of workers */
    TuplesortInstrumentation sinstrument[FLEXIBLE_ARRAY_MEMBER];    /* per-worker sort statistics */
} SharedSortInfo;

TuplesortInstrumentation
Data structures for reporting sort statistics.

/*
 * Data structures for reporting sort statistics.  Note that
 * TuplesortInstrumentation can't contain any pointers because we
 * sometimes put it in shared memory.
 */
typedef enum
{
    SORT_TYPE_STILL_IN_PROGRESS = 0,    /* sort still in progress */
    SORT_TYPE_TOP_N_HEAPSORT,           /* top-N heapsort */
    SORT_TYPE_QUICKSORT,                /* quicksort */
    SORT_TYPE_EXTERNAL_SORT,            /* external sort */
    SORT_TYPE_EXTERNAL_MERGE            /* external sort followed by merge */
} TuplesortMethod;

typedef enum
{
    SORT_SPACE_TYPE_DISK,               /* space is on disk */
    SORT_SPACE_TYPE_MEMORY              /* space is in memory */
} TuplesortSpaceType;

typedef struct TuplesortInstrumentation
{
    TuplesortMethod sortMethod;     /* sort algorithm used */
    TuplesortSpaceType spaceType;   /* type of space spaceUsed represents */
    long        spaceUsed;          /* space consumption, in kB */
} TuplesortInstrumentation;

2. Source code walkthrough

mergeruns merges all the completed initial runs.
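
As a warm-up, the basic idea of merging several sorted runs into one sorted output can be sketched as follows. This is only an illustration under simplifying assumptions: ToyRun and toy_merge_runs are hypothetical names, the runs live in memory instead of on logical tapes, and the smallest head element is found by a linear scan, whereas tuplesort.c keeps one tuple per input tape in a heap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for one sorted run on a tape. */
typedef struct ToyRun
{
    const int  *data;   /* sorted values of this run */
    size_t      len;    /* number of values in the run */
    size_t      pos;    /* index of the next unread value */
} ToyRun;

/*
 * Merge nruns sorted runs into out[], which must have room for the total
 * number of values.  Returns the number of values written.
 */
static size_t
toy_merge_runs(ToyRun *runs, size_t nruns, int *out)
{
    size_t      nout = 0;

    for (;;)
    {
        size_t      best = nruns;   /* nruns means "no run has data left" */
        size_t      i;

        /* Pick the run whose next unread value is the smallest. */
        for (i = 0; i < nruns; i++)
        {
            if (runs[i].pos >= runs[i].len)
                continue;           /* this run is exhausted */
            if (best == nruns ||
                runs[i].data[runs[i].pos] < runs[best].data[runs[best].pos])
                best = i;
        }

        if (best == nruns)
            break;                  /* every run is exhausted */

        out[nout++] = runs[best].data[runs[best].pos++];
    }

    return nout;
}
```

Replacing the linear scan with a heap keyed on each run's head element reduces the cost of picking the next value from O(k) to O(log k) for k runs.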

/*
 * mergeruns -- merge all the completed initial runs.
 *
 * This implements steps D5, D6 of Algorithm D.  All input data has
 * already been written to initial runs on tape (see dumptuples).
 */
static void
mergeruns(Tuplesortstate *state)
{
    int         tapenum,
                svTape,
                svRuns,
                svDummy;
    int         numTapes;
    int         numInputTapes;

    Assert(state->status == TSS_BUILDRUNS);
    Assert(state->memtupcount == 0);

    if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
    {
        /*
         * If there are multiple runs to be merged, when we go to read back
         * tuples from disk, abbreviated keys will not have been stored, and
         * we don't care to regenerate them.  Disable abbreviation from this
         * point on.
         */
        state->sortKeys->abbrev_converter = NULL;
        state->sortKeys->comparator = state->sortKeys->abbrev_full_comparator;

        /* Not strictly necessary, but be tidy */
        state->sortKeys->abbrev_abort = NULL;
        state->sortKeys->abbrev_full_comparator = NULL;
    }

    /*
     * Reset tuple memory.  We've freed all the tuples that we previously
     * allocated.  We will use the slab allocator from now on.
     */
    MemoryContextDelete(state->tuplecontext);
    state->tuplecontext = NULL;

    /*
     * We no longer need a large memtuples array.  (We will allocate a smaller
     * one for the heap later.)
     */
    FREEMEM(state, GetMemoryChunkSpace(state->memtuples));
    pfree(state->memtuples);
    state->memtuples = NULL;

    /*
     * If we had fewer runs than tapes, refund the memory that we imagined we
     * would need for the tape buffers of the unused tapes.
     *
     * numTapes and numInputTapes reflect the actual number of tapes we will
     * use.  Note that the output tape's tape number is maxTapes - 1, so the
     * tape numbers of the used tapes are not consecutive, and you cannot just
     * loop from 0 to numTapes to visit all used tapes!
     */
    if (state->Level == 1)
    {
        numInputTapes = state->currentRun;
        numTapes = numInputTapes + 1;
        FREEMEM(state, (state->maxTapes - numTapes) * TAPE_BUFFER_OVERHEAD);
    }
    else
    {
        numInputTapes = state->tapeRange;
        numTapes = state->maxTapes;
    }

    /*
     * Initialize the slab allocator.  We need one slab slot per input tape,
     * for the tuples in the heap, plus one to hold the tuple last returned
     * from tuplesort_gettuple.  (If we're sorting pass-by-val Datums,
     * however, we don't need to allocate anything.)
     *
     * From this point on, we no longer use the USEMEM()/LACKMEM() mechanism
     * to track memory usage of individual tuples.
     */
    if (state->tuples)
        init_slab_allocator(state, numInputTapes + 1);
    else
        init_slab_allocator(state, 0);

    /*
     * Allocate a new 'memtuples' array, for the heap.  It will hold one tuple
     * from each input tape.
     */
    state->memtupsize = numInputTapes;
    state->memtuples = (SortTuple *) palloc(numInputTapes * sizeof(SortTuple));
    USEMEM(state, GetMemoryChunkSpace(state->memtuples));

    /*
     * Use all the remaining memory we have available for read buffers among
     * the input tapes.
     *
     * We don't try to "rebalance" the memory among tapes, when we start a new
     * merge phase, even if some tapes are inactive in the new phase.  That
     * would be hard, because logtape.c doesn't know where one run ends and
     * another begins.  When a new merge phase begins, and a tape doesn't
     * participate in it, its buffer nevertheless already contains tuples from
     * the next run on same tape, so we cannot release the buffer.  That's OK
     * in practice, merge performance isn't that sensitive to the amount of
     * buffers used, and most merge phases use all or almost all tapes,
     * anyway.
     */
#ifdef TRACE_SORT
    if (trace_sort)
        elog(LOG, "worker %d using " INT64_FORMAT " KB of memory for read buffers among %d input tapes",
             state->worker, state->availMem / 1024, numInputTapes);
#endif

    state->read_buffer_size = Max(state->availMem / numInputTapes, 0);
    USEMEM(state, state->read_buffer_size * numInputTapes);

    /* End of step D2: rewind all output tapes to prepare for merging */
    for (tapenum = 0; tapenum < state->tapeRange; tapenum++)
        LogicalTapeRewindForRead(state->tapeset, tapenum, state->read_buffer_size);

    for (;;)
    {
        /*
         * At this point we know that tape[T] is empty.  If there's just one
         * (real or dummy) run left on each input tape, then only one merge
         * pass remains.  If we don't have to produce a materialized sorted
         * tape, we can stop at this point and do the final merge on-the-fly.
         */
        if (!state->randomAccess && !WORKER(state))
        {
            bool        allOneRun = true;

            Assert(state->tp_runs[state->tapeRange] == 0);
            for (tapenum = 0; tapenum < state->tapeRange; tapenum++)
            {
                if (state->tp_runs[tapenum] + state->tp_dummy[tapenum] != 1)
                {
                    allOneRun = false;
                    break;
                }
            }
            if (allOneRun)
            {
                /* Tell logtape.c we won't be writing anymore */
                LogicalTapeSetForgetFreeSpace(state->tapeset);
                /* Initialize for the final merge pass */
                beginmerge(state);
                state->status = TSS_FINALMERGE;
                return;
            }
        }

        /* Step D5: merge runs onto tape[T] until tape[P] is empty */
        while (state->tp_runs[state->tapeRange - 1] ||
               state->tp_dummy[state->tapeRange - 1])
        {
            bool        allDummy = true;

            for (tapenum = 0; tapenum < state->tapeRange; tapenum++)
            {
                if (state->tp_dummy[tapenum] == 0)
                {
                    allDummy = false;
                    break;
                }
            }

            if (allDummy)
            {
                state->tp_dummy[state->tapeRange]++;
                for (tapenum = 0; tapenum < state->tapeRange; tapenum++)
                    state->tp_dummy[tapenum]--;
            }
            else
                mergeonerun(state);
        }

        /* Step D6: decrease level */
        if (--state->Level == 0)
            break;
        /* rewind output tape T to use as new input */
        LogicalTapeRewindForRead(state->tapeset, state->tp_tapenum[state->tapeRange],
                                 state->read_buffer_size);
        /* rewind used-up input tape P, and prepare it for write pass */
        LogicalTapeRewindForWrite(state->tapeset, state->tp_tapenum[state->tapeRange - 1]);
        state->tp_runs[state->tapeRange - 1] = 0;

        /*
         * reassign tape units per step D6; note we no longer care about A[]
         */
        svTape = state->tp_tapenum[state->tapeRange];
        svDummy = state->tp_dummy[state->tapeRange];
        svRuns = state->tp_runs[state->tapeRange];
        for (tapenum = state->tapeRange; tapenum > 0; tapenum--)
        {
            state->tp_tapenum[tapenum] = state->tp_tapenum[tapenum - 1];
            state->tp_dummy[tapenum] = state->tp_dummy[tapenum - 1];
            state->tp_runs[tapenum] = state->tp_runs[tapenum - 1];
        }
        state->tp_tapenum[0] = svTape;
        state->tp_dummy[0] = svDummy;
        state->tp_runs[0] = svRuns;
    }

    /*
     * Done.  Knuth says that the result is on TAPE[1], but since we exited
     * the loop without performing the last iteration of step D6, we have not
     * rearranged the tape unit assignment, and therefore the result is on
     * TAPE[T].  We need to do it this way so that we can freeze the final
     * output tape while rewinding it.  The last iteration of step D6 would be
     * a waste of cycles anyway...
     */
    state->result_tape = state->tp_tapenum[state->tapeRange];
    if (!WORKER(state))
        LogicalTapeFreeze(state->tapeset, state->result_tape, NULL);
    else
        worker_freeze_result_tape(state);
    state->status = TSS_SORTEDONTAPE;

    /* Release the read buffers of all the other tapes, by rewinding them. */
    for (tapenum = 0; tapenum < state->maxTapes; tapenum++)
    {
        if (tapenum != state->result_tape)
            LogicalTapeRewindForWrite(state->tapeset, tapenum);
    }
}

3. Trace analysis

Test script

select * from t_sort order by c1, c2;

Tracing with gdb

(gdb) b mergeruns
Breakpoint 1 at 0xa73508: file tuplesort.c, line 2570.
(gdb)
Note: breakpoint 1 also set at pc 0xa73508.
Breakpoint 2 at 0xa73508: file tuplesort.c, line 2570.

Input parameters

(gdb) c
Continuing.

Breakpoint 1, mergeruns (state=0x2b808a8) at tuplesort.c:2570
2570        Assert(state->status == TSS_BUILDRUNS);
(gdb) p *state
$1 = {status = TSS_BUILDRUNS, nKeys = 2, randomAccess = false, bounded = false, boundUsed = false, bound = 0,
  tuples = true, availMem = 3164456, allowedMem = 4194304, maxTapes = 16, tapeRange = 15,
  sortcontext = 0x2b80790, tuplecontext = 0x2b827a0, tapeset = 0x2b81480,
  comparetup = 0xa7525b <comparetup_heap>, copytup = 0xa76247 <copytup_heap>,
  writetup = 0xa76de1 <writetup_heap>, readtup = 0xa76ec6 <readtup_heap>,
  memtuples = 0x7f0cfeb14050, memtupcount = 0, memtupsize = 37448, growmemtuples = false,
  slabAllocatorUsed = false, slabMemoryBegin = 0x0, slabMemoryEnd = 0x0, slabFreeHead = 0x0,
  read_buffer_size = 0, lastReturnedTuple = 0x0, currentRun = 3, mergeactive = 0x2b81350, Level = 1,
  destTape = 2, tp_fib = 0x2b80d58, tp_runs = 0x2b81378, tp_dummy = 0x2b813d0, tp_tapenum = 0x2b81428,
  activeTapes = 0, result_tape = -1, current = 0, eof_reached = false, markpos_block = 0,
  markpos_offset = 0, markpos_eof = false, worker = -1, shared = 0x0, nParticipants = -1,
  tupDesc = 0x2b67ae0, sortKeys = 0x2b80cc0, onlyKey = 0x0, abbrevNext = 10, indexInfo = 0x0,
  estate = 0x0, heapRel = 0x0, indexRel = 0x0, enforceUnique = false, high_mask = 0, low_mask = 0,
  max_buckets = 0, datumType = 0, datumTypeLen = 0, ru_start = {tv = {tv_sec = 0, tv_usec = 0},
    ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0},
      {ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0},
      {ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0},
      {ru_minflt = 0, __ru_minflt_word = 0}, {ru_majflt = 0, __ru_majflt_word = 0},
      {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 0, __ru_inblock_word = 0},
      {ru_oublock = 0, __ru_oublock_word = 0}, {ru_msgsnd = 0, __ru_msgsnd_word = 0},
      {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0},
      {ru_nvcsw = 0, __ru_nvcsw_word = 0}, {ru_nivcsw = 0, __ru_nivcsw_word = 0}}}}
(gdb)

Sort keys and related information

(gdb) n
2571        Assert(state->memtupcount == 0);
(gdb)
2573        if (state->sortKeys != NULL && state->sortKeys->abbrev_converter != NULL)
(gdb) p *state->sortKeys
$2 = {ssup_cxt = 0x2b80790, ssup_collation = 0, ssup_reverse = false, ssup_nulls_first = false,
  ssup_attno = 2, ssup_extra = 0x0, comparator = 0x4fd4af <btint4fastcmp>, abbreviate = true,
  abbrev_converter = 0x0, abbrev_abort = 0x0, abbrev_full_comparator = 0x0}
(gdb) p *state->sortKeys->abbrev_converter
Cannot access memory at address 0x0

Reset tuple memory; the large memtuples array is no longer needed.

(gdb) n
2593        MemoryContextDelete(state->tuplecontext);
(gdb)
2594        state->tuplecontext = NULL;
(gdb)
(gdb) n
2600        FREEMEM(state, GetMemoryChunkSpace(state->memtuples));
(gdb)
2601        pfree(state->memtuples);
(gdb)
2602        state->memtuples = NULL;
(gdb)
2613        if (state->Level == 1)
(gdb)

Compute the number of tapes

(gdb) n
2615            numInputTapes = state->currentRun;
(gdb) p state->currentRun
$3 = 3
(gdb) p state->Level
$4 = 1
(gdb) p state->tapeRange
$5 = 15
(gdb) p state->maxTapes
$6 = 16
(gdb) n
2616            numTapes = numInputTapes + 1;
(gdb)
2617            FREEMEM(state, (state->maxTapes - numTapes) * TAPE_BUFFER_OVERHEAD);
(gdb)
2634        if (state->tuples)
(gdb) p numInputTapes
$7 = 3
(gdb) p numTapes
$8 = 4
(gdb)
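
The tape arithmetic in this step can be reproduced in isolation. The sketch below mirrors the Level == 1 branch and its else branch from mergeruns. ToyTapePlan and toy_plan_tapes are hypothetical names, and the per-tape buffer overhead is passed in as a parameter because the value of TAPE_BUFFER_OVERHEAD is not shown in this article.

```c
#include <assert.h>

/* Hypothetical result type: tape counts plus the memory handed back. */
typedef struct ToyTapePlan
{
    int         numInputTapes;  /* tapes that will be read during the merge */
    int         numTapes;       /* input tapes plus the output tape */
    long        refundBytes;    /* buffer memory refunded for unused tapes */
} ToyTapePlan;

static ToyTapePlan
toy_plan_tapes(int level, int currentRun, int tapeRange, int maxTapes,
               long tapeBufferOverhead)
{
    ToyTapePlan plan;

    assert(maxTapes == tapeRange + 1);

    if (level == 1)
    {
        /* Fewer runs than tapes: one input tape per run, plus the output. */
        plan.numInputTapes = currentRun;
        plan.numTapes = plan.numInputTapes + 1;
        plan.refundBytes = (long) (maxTapes - plan.numTapes) * tapeBufferOverhead;
    }
    else
    {
        /* All tapes are in use, so nothing can be refunded. */
        plan.numInputTapes = tapeRange;
        plan.numTapes = maxTapes;
        plan.refundBytes = 0;
    }

    return plan;
}
```

With the traced values (Level = 1, currentRun = 3, tapeRange = 15, maxTapes = 16) this yields numInputTapes = 3 and numTapes = 4, matching $7 and $8 above; the buffers of the remaining 12 tapes are refunded.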

Initialize the slab allocator, allocate a new 'memtuples' array for the heap, and rewind all output tapes to prepare for merging

(gdb) n
2635            init_slab_allocator(state, numInputTapes + 1);
(gdb) n
2643        state->memtupsize = numInputTapes;
(gdb)
2644        state->memtuples = (SortTuple *) palloc(numInputTapes * sizeof(SortTuple));
(gdb)
2645        USEMEM(state, GetMemoryChunkSpace(state->memtuples));
(gdb) p state->memtupsize
$9 = 3
(gdb) n
2662        if (trace_sort)
(gdb)
2667        state->read_buffer_size = Max(state->availMem / numInputTapes, 0);
(gdb)
2668        USEMEM(state, state->read_buffer_size * numInputTapes);
(gdb) p state->read_buffer_size
$10 = 1385762
(gdb) n
2671        for (tapenum = 0; tapenum < state->tapeRange; tapenum++)
(gdb)
2672            LogicalTapeRewindForRead(state->tapeset, tapenum, state->read_buffer_size);
(gdb) p state->tapeRange
$11 = 15
(gdb) p state->status
$12 = TSS_BUILDRUNS
(gdb)
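
The read buffer size is simply the remaining memory divided evenly among the input tapes, clamped at zero because availMem can be negative when the sort is over budget. A minimal sketch of that rule follows; toy_read_buffer_size is a hypothetical helper name, and the availMem value used in the usage note is an assumption, since the trace does not print availMem at this point.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split the remaining memory evenly among the input tapes.  The result is
 * clamped at zero, like Max(availMem / numInputTapes, 0) in mergeruns.
 */
static int64_t
toy_read_buffer_size(int64_t availMem, int numInputTapes)
{
    int64_t     perTape;

    assert(numInputTapes > 0);
    perTape = availMem / numInputTapes;
    return perTape > 0 ? perTape : 0;
}
```

For example, an assumed availMem of 4157286 bytes split over 3 input tapes gives 1385762 bytes per tape, the same figure as $10 in the trace, while a negative availMem gives 0.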

Enter the loop

2671        for (tapenum = 0; tapenum < state->tapeRange; tapenum++)
(gdb)
2682            if (!state->randomAccess && !WORKER(state))
(gdb)
2684                bool        allOneRun = true;
(gdb) p state->randomAccess
$15 = false
(gdb) p WORKER(state)
$16 = 0
(gdb)

Loop over the input tapes to check whether allOneRun has to be set to false

2687                for (tapenum = 0; tapenum < state->tapeRange; tapenum++)
(gdb)
2695                if (allOneRun)
(gdb) p allOneRun
$19 = true
(gdb)
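
The allOneRun test asks whether every input tape holds exactly one run, counting real and dummy runs together. The sketch below isolates that check. toy_all_one_run is a hypothetical name, and the array contents mentioned in the usage note are an assumed distribution consistent with the trace (3 real runs, 15 input tapes), not values printed by gdb.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return true when each of the tapeRange input tapes carries exactly one
 * run, real (tp_runs) or dummy (tp_dummy), so that a single on-the-fly
 * merge pass is enough.
 */
static bool
toy_all_one_run(const int *tp_runs, const int *tp_dummy, int tapeRange)
{
    int         tapenum;

    for (tapenum = 0; tapenum < tapeRange; tapenum++)
    {
        if (tp_runs[tapenum] + tp_dummy[tapenum] != 1)
            return false;
    }
    return true;
}
```

If three tapes carry one real run each and the other twelve carry one dummy run each, the function returns true, which corresponds to the branch taken in the trace; a tape holding two runs would make it return false and force at least one full merge pass.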

Begin the merge, set the status, and return

(gdb) n
2698                    LogicalTapeSetForgetFreeSpace(state->tapeset);
(gdb)
2700                    beginmerge(state);
(gdb)
2701                    state->status = TSS_FINALMERGE;
(gdb)
2702                    return;
(gdb)
2779    }
(gdb)
tuplesort_performsort (state=0x2b808a8) at tuplesort.c:1866
1866                state->eof_reached = false;
(gdb)

Finish tuplesort_performsort and return to ExecSort

(gdb) n
1867                state->markpos_block = 0L;
(gdb)
1868                state->markpos_offset = 0;
(gdb)
1869                state->markpos_eof = false;
(gdb)
1870                break;
(gdb)
1878        if (trace_sort)
(gdb)
1890        MemoryContextSwitchTo(oldcontext);
(gdb)
1891    }
(gdb)
ExecSort (pstate=0x2b67640) at nodeSort.c:123
123             estate->es_direction = dir;
(gdb) c
Continuing.
