PostgreSQL中的Tuplesortstate数据结构是怎样的

2025-03-13 技术教程

本篇内容主要讲解“PostgreSQL中的Tuplesortstate数据结构是怎样的”，感兴趣的朋友不妨来看看。本文介绍的方法操作简单快捷，实用性强。下面就让小编来带大家学习“PostgreSQL中的Tuplesortstate数据结构是怎样的”吧!

Tuplesortstate
Tuplesort操作的私有状态.

/**PossiblestatesofaTuplesortobject.Thesedenotethestatesthat*persistbetweencallsofTuplesortroutines.*Tuplesort对象可能的状态.*这些标示在Tuplesort例程调用之间会持久存在的状态.*/typedefenum{//装载元组,在内存限制之内TSS_INITIAL,/*Loadingtuples;stillwithinmemorylimit*///装载元组到有界堆中TSS_BOUNDED,/*Loadingtuplesintobounded-sizeheap*///装载元组,写入到tape中TSS_BUILDRUNS,/*Loadingtuples;writingtotape*///完全在内存中完成排序TSS_SORTEDINMEM,/*Sortcompletedentirelyinmemory*///完成排序,最后在tape上执行排序TSS_SORTEDONTAPE,/*Sortcompleted,finalrunisontape*///不落地执行最后的归并TSS_FINALMERGE/*Performingfinalmergeon-the-fly*/}TupSortStatus;/**Parametersforcalculationofnumberoftapestouse---seeinittapes()*andtuplesort_merge_order().*用于计算需要使用多少个tapes的参数.---详细参见inittapes()和tuplesort_merge_order().**Inthiscalculationweassumethateachtapewillcostusabout1blocks*worthofbufferspace.Thisignorestheoverheadofalltheotherdata*structuresneededforeachtape,butit'sprobablycloseenough.*在这个计算中,我们假定每一个tape会大概消耗缓存空间的一个block.*虽然已经忽略了所有每个tape依赖的其他数据结构,但已经非常接近了.**MERGE_BUFFER_SIZEishowmuchdatawe'dliketoreadfromeachinput*tapeduringaprereadcycle(seediscussionattopoffile).*MERGE_BUFFER_SIZE表示在每一轮读周期中我们将要从每个输入taple中读取的数据大小*/#defineMINORDER6/*minimummergeorder*/#defineMAXORDER500/*maximummergeorder*/#defineTAPE_BUFFER_OVERHEADBLCKSZ#defineMERGE_BUFFER_SIZE(BLCKSZ*32)typedefint(*SortTupleComparator)(constSortTuple*a,constSortTuple*b,Tuplesortstate*state);/**PrivatestateofaTuplesortoperation.*Tuplesort操作的私有状态.*/structTuplesortstate{//状态:枚举值详见上面的信息TupSortStatusstatus;/*enumeratedvalueasshownabove*///sortkey中的列数intnKeys;/*numberofcolumnsinsortkey*///调用者需要随机访问?boolrandomAccess;/*didcallerrequestrandomaccess?*///调用者是否指定了最大返回的元组的数目?boolbounded;/*didcallerspecifyamaximumnumberof*tuplestoreturn?*///使用有界堆,则返回TboolboundUsed;/*trueifwemadeuseofaboundedheap*///如为有界堆,这里存储最大的元组个数intbound;/*ifbounded,themaximumnumberoftuples*///SortTuple.tuple是否可以设置?booltuples;/*CanSortTuple.tupleeverbeset?*///剩余可用内存大小(单位:字节)int64availMem;/*remainingmemoryavailable,inbytes*///允许的内存总大小(单位:字节)int64allowedMem;/*totalmemoryallowed,inbytes*///tapes个数intmaxTapes;/*numberoftapes(Knuth'sT)*///tapes个数-1inttapeRange;/*maxTapes-1(Knuth'sP)*///主要用于排序数据的内存上下文MemoryContextsortcontext;/*memorycontextholdingmostsortdata*///用于元组数据的sortcontext的子上下文MemoryContexttuplecontext;/*sub-contextofsortcontextfortupledata*///临时文件中tapes的logtape.c对象LogicalTapeSet*tapeset;/*logtape.cobjectfortapesinatempfile*//**Thesefunctionpointersdecoupletheroutinesthatmustknowwhatkind*oftuplewearesortingfromtheroutinesthatdon'tneedtoknowit.*Theyaresetupbythetuplesort_begin_xxxroutines.*这些函数指针将必须知道排序的哪种元组的例程与不需要知道它的例程解耦.**Functiontocomparetwotuples;resultisperqsort()convention,ie:*<0,0,>0accordingasa<b,a=b,a>b.TheAPImustmatch*qsort_arg_comparator.*比较两个元组的函数,结果由每个qsort()约定,比如:*<0,0,>0代表a<b,a=b,a>b.API必须匹配qsort_arg_comparator.*/SortTupleComparatorcomparetup;/**Functiontocopyasuppliedinputtupleintopalloc'dspaceandsetup*itsSortTuplerepresentation(ie,settuple/datum1/isnull1).Also,*state->availMemmustbedecreasedbytheamountofspaceusedforthe*tuplecopy(notetheSortTuplestructitselfisnotcounted).*该函数用于拷贝一个输入的元组到由palloc分配的内存空间中,*同时设置SortTuple数据结构(比如设置tuple/datum1/isnull1等).*同时,state->availMem必须减去用于元组拷贝的空间大小(注意:SortTuple结构体不计算在内).*/void(*copytup)(Tuplesortstate*state,SortTuple*stup,void*tup);/**Functiontowriteastoredtupleontotape.Therepresentationofthe*tupleontapeneednotbethesameasitisinmemory;requirementson*thetaperepresentationaregivenbelow.Unlesstheslaballocatoris*used,afterwritingthetuple,pfree()theout-of-linedata(notthe*SortTuplestruct!),andincreasestate->availMembytheamountof*memoryspacetherebyreleased.*用于写入元组到taple的函数.*tape中的元组声明不需要与内存中的一致,tape中的声明要求详见下面说明.*除非使用slab分配器,在写入元组后,pfree()out-of-line的数据(不是SortTuple结构体),*同时把刚才释放的内存空间加到state->availMem中.*/void(*writetup)(Tuplesortstate*state,inttapenum,SortTuple*stup);/**Functiontoreadastoredtuplefromtapebackintomemory.'len'is*thealready-readlengthofthestoredtuple.Thetupleisallocated*fromtheslabmemoryarena,orispalloc'd,seereadtup_alloc().*从tape中读取元组到内存中的函数.*'len'是已读取的存储元组的长度.元组在slab内存空间/palloc中分配,详细参考readtup_alloc()函数*/void(*readtup)(Tuplesortstate*state,SortTuple*stup,inttapenum,unsignedintlen);/**Thisarrayholdsthetuplesnowinsortmemory.Ifweareinstate*INITIAL,thetuplesareinnoparticularorder;ifweareinstate*SORTEDINMEM,thetuplesareinfinalsortedorder;instatesBUILDRUNS*andFINALMERGE,thetuplesareorganizedin"heap"orderperAlgorithm*H.InstateSORTEDONTAPE,thearrayisnotused.*该数组保存排序内存中的元组.当前状态为*INITIAL:元组没有特定的顺序;*SORTEDINMEM:元组处于最终已排序的状态;*BUILDRUNS/FINALMERGE:元组按算法H的'堆'顺序组织.*SORTEDONTAPE:数组未使用.*///SortTuple结构体数组SortTuple*memtuples;/*arrayofSortTuplestructs*///当前存在的元组数intmemtupcount;/*numberoftuplescurrentlypresent*///memtuples数组的已分配的大小intmemtupsize;/*allocatedlengthofmemtuplesarray*///memtuples的增长仍在进行中?boolgrowmemtuples;/*memtuples'growthstillunderway?*//**Memoryfortuplesissometimesallocatedusingasimpleslaballocator,*ratherthanwithpalloc().Currently,weswitchtoslaballocation*whenwestartmerging.Mergingonlyneedstokeepasmall,fixed*numberoftuplesinmemoryatanytime,sowecanavoidthe*palloc/pfreeoverheadbyrecyclingafixednumberoffixed-sizeslots*toholdthetuples.*有时候元组的内存分配使用简单的slab分配器实现而不是palloc().*在开始归并时,同步切换至slab分配器.归并只需要在内存中保持简单/固定的元组数目,*因此可以避免频繁回收固定数目固定大小的slots(用于保存元组)而导致的palloc/pfree过载.**Fortheslab,weuseonelargeallocation,dividedintoSLAB_SLOT_SIZE*slots.Theallocationissizedtohaveoneslotpertape,plusone*additionalslot.Weneedthatmanyslotstoholdallthetupleskept*intheheapduringmerge,plustheonewehavelastreturnedfromthe*sort,withtuplesort_gettuple.*对于slab,使用大型分配器,拆分为SLAB_SLOT_SIZE个大小的slots.*分配器的大小为每个tape一个slot,外加一个为的slot.*我们需要这么多slots是因为在归并期间需要保存所有在堆中的元组,外加tuplesort_gettuple从排序中最终返回的元组**Initially,alltheslotsarekeptinalinkedlistoffreeslots.When*atupleisreadfromatape,itisputtothenextavailableslot,if*itfits.IfthetupleislargerthanSLAB_SLOT_SIZE,itispalloc'd*instead.*一开始,所有的slots在空闲slots中以链表的方式保存.*从tape中读取元组时,如合适的话,元组会放到下一个可用slot中.*如果元组比SLAB_SLOT_SIZE要大,改用palloc分配内存空间.**Whenwe'redoneprocessingatuple,wereturntheslotbacktothefree*list,orpfree()ifitwaspalloc'd.Weknowthatatuplewas*allocatedfromtheslab,ifitspointervalueisbetween*slabMemoryBeginand-End.*如果已完成元组的处理,返回slot到空闲链表中,如果使用palloc分配则使用pfree回收空间.*如果元组指针值在slabMemoryBegin和slabMemoryEnd之间,那么我们可以知道元组是从slab中分配的.**Whentheslaballocatorisused,theUSEMEM/LACKMEMmechanismof*trackingmemoryusageisnotused.*如使用了slab分配器,不会使用USEMEM/LACKMEM机制跟踪内存使用.*/boolslabAllocatorUsed;//slab内存空间的起始位置char*slabMemoryBegin;/*beginningofslabmemoryarena*///slab内存空间的结束位置char*slabMemoryEnd;/*endofslabmemoryarena*///链表头SlabSlot*slabFreeHead;/*headoffreelist*//*Buffersizetouseforreadinginputtapes,duringmerge.*///在归并期间用于读取输入tapes的缓存大小size_tread_buffer_size;/**Whenwereturnatupletothecallerintuplesort_gettuple_XXX,that*camefromatape(thatis,inTSS_SORTEDONTAPEorTSS_FINALMERGE*modes),werememberthetuplein'lastReturnedTuple',sothatwecan*recyclethememoryonnextgettuplecall.*通过tuplesort_gettuple_XXX方法调用返回元组给调用者时,元组从tape中获取*(也就是说,TSS_SORTEDONTAPE/TSS_FINALMERGE模式),这时候会把元组放在'lastReturnedTuple'中,*因此可在下次gettuple调用中回收内存.*/void*lastReturnedTuple;/**Whilebuildinginitialruns,thisisthecurrentoutputrunnumber.*Afterwards,itisthenumberofinitialrunswemade.*在构建initial运行时,这是当前输出的run编号.*后续这是我们构建好的initialruns的编号.*/intcurrentRun;/**Unlessotherwisenoted,allpointervariablesbelowarepointersto*arraysoflengthmaxTapes,holdingper-tapedata.*除非特别注意,下面所有的指针变量是指向长度为maxTapes的数组,保存per-tape数据.*//**Thisvariableisonlyusedduringmergepasses.mergeactive[i]istrue*ifwearereadinganinputrunfrom(actual)tapenumberiandhavenot*yetexhaustedthatrun.*该变量在归并过程中使用.*mergeactive[i]为T如果我们从编号为i的tape中读取数据,仍未在该run中消耗完毕.*///活跃的输入run源?bool*mergeactive;/*activeinputrunsource?*//**VariablesforAlgorithmD.NotethatdestTapeisa"logical"tape*number,ie,anindexintothetp_xxx[]arrays.Becarefultokeep*"logical"and"actual"tapenumbersstraight!*用于算法D的变量.*注意destTape是一个逻辑tape编号,例如,是指向tp_xxx[]数组的索引.*注意保持"逻辑"和"实际"tape编号的连续性.*///Knuth'slintLevel;/*Knuth'sl*///当前输出tape(Knuth'sj)intdestTape;/*currentoutputtape(Knuth'sj,less1)*///目标斐波拉契run计数(A[])int*tp_fib;/*TargetFibonacciruncounts(A[])*///每一个tape上真正runs的编号int*tp_runs;/*#ofrealrunsoneachtape*///每一个tape(D[])上虚拟runs的编号int*tp_dummy;/*#ofdummyrunsforeachtape(D[])*///实际的tape编号(TAPE[])int*tp_tapenum;/*Actualtapenumbers(TAPE[])*///归并轮中的活动输入tapes编号intactiveTapes;/*#ofactiveinputtapesinmergepass*//**Thesevariablesareusedaftercompletionofsortingtokeeptrackof*thenexttupletoreturn.(Inthetapecase,thetape'scurrentread*positionisalsocriticalstate.)*这些变量用于在排序完成后保持下一个返回元组时的跟踪.*(在tape情况下,tape的当前读取位置也是重要的状态)*///已完成输出的实际tape编号intresult_tape;/*actualtapenumberoffinishedoutput*///数组编号(仅用于SORTEDINMEM)intcurrent;/*arrayindex(onlyusedifSORTEDINMEM)*///是否到达EOF(用于游标)booleof_reached;/*reachedEOF(neededforcursors)*//*markpos_xxxholdsmarkedpositionformarkandrestore*///markpos_xxx保持已标记的位置,用于标记和存储//tapeblock编号(只用于SORTEDONTAPE)longmarkpos_block;/*tapeblock#(onlyusedifSORTEDONTAPE)*///存储的"current",或者tape块中的偏移intmarkpos_offset;/*saved"current",oroffsetintapeblock*///存储的eof_reachedboolmarkpos_eof;/*saved"eof_reached"*//**Thesevariablesareusedduringparallelsorting.*这些变量用于并行排序.**workerisourworkeridentifier.Followsthegeneralconventionthat*-1valuerelatestoaleadertuplesort,andvalues>=0worker*tuplesorts.(-1canalsobeaserialtuplesort.)*worker是worker标识符ID.*遵循一般约定，-1值与leadertuplesort相关，并且值>=0表示workertuplesorts。*(在串行tuplesort时,-1也可以表示这种情况)**sharedismutablesharedmemorystate,whichisusedtocoordinate*parallelsorts.*shared是可变的共享内存状态,用于协调并行排序.**nParticipantsisthenumberofworkerTuplesortstatesknownbythe*leadertohaveactuallybeenlaunched,whichimpliesthattheymust*finisharunleadercanmerge.Typicallyincludesaworkerstateheld*bytheleaderprocessitself.SetintheleaderTuplesortstateonly.*nParticipants是已知的workerTuplesortstates的数目,这些worker由leader感知,是实际启动的worker数,*这也意味着在leader可以归并前这些worker必须完成.*典型的,leader进行自身包含至少一个worker状态.*只在leader的Tuplesortstate中设置.*/intworker;Sharedsort*shared;intnParticipants;/**ThesortKeysvariableisusedbyeverycaseotherthanthehashindex*case;itissetbytuplesort_begin_xxx.tupDescisonlyusedbythe*MinimalTupleandCLUSTERroutines,though.*sortKeys变量用于every而不是hashindex,通过tuplesort_begin_xxx设置.*tupDesc只由MinimalTuple和CLUSTER例程使用.*/TupleDesctupDesc;//长度nKeys数组SortSupportsortKeys;/*arrayoflengthnKeys*//**Thisvariableissharedbythesingle-keyMinimalTuplecaseandthe*Datumcase(whichbothuseqsort_ssup()).Otherwiseit'sNULL.*该变量在单键MinimalTuple和Datum情况下(使用qsort_ssup()函数)共享使用,否则的话值为NULL.*/SortSupportonlyKey;/**Additionalstateformanaging"abbreviatedkey"sortsupportroutines*(whichcurrentlymaybeusedbyallcasesexceptthehashindexcase).*Trackstheintervalsatwhichtheoptimization'seffectivenessis*tested.*管理"缩写键"sortsupport过程的额外状态.*(除了hashindex外会被其他情况使用)*跟踪在优化器有效性测试时时间间隔.*/int64abbrevNext;/*Tuple#atwhichtonextcheck*applicability*//**ThesevariablesarespecifictotheCLUSTERcase;theyaresetby*tuplesort_begin_cluster.*这些变量仅在CLUSTER时生效,通过tuplesort_begin_cluster设置.*///将用于依赖的索引信息IndexInfo*indexInfo;/*infoaboutindexbeingusedforreference*///解析索引表达式的运行期状态EState*estate;/*forevaluatingindexexpressions*//**ThesevariablesarespecifictotheIndexTuplecase;theyaresetby*tuplesort_begin_index_xxxandusedonlybytheIndexTupleroutines.*这些变量仅用于IndexTuple.*通过tuplesort_begin_index_xxx设置,仅用于IndexTuple例程.*///数据表RelationheapRel;/*tabletheindexisbeingbuilton*///正在创建的indexRelationindexRel;/*indexbeingbuilt*//*Thesearespecifictotheindex_btreesubcase:*///这些仅在index_btree下使用//如发现重复元组,则提示boolenforceUnique;/*complainifwefindduplicatetuples*//*Thesearespecifictotheindex_hashsubcase:*///index_hash情况uint32high_mask;/*masksforsortablepartofhashcode*/uint32low_mask;uint32max_buckets;/**ThesevariablesarespecifictotheDatumcase;theyaresetby*tuplesort_begin_datumandusedonlybytheDatumTupleroutines.*这些变量用于Datum,通过tuplesort_begin_datum设置,仅用于DatumTuple例程.*/OiddatumType;/*weneedtypeleninordertoknowhowtocopytheDatums.*///需要typelen用于知道如何拷贝Datums.intdatumTypeLen;/**Resourcesnapshotfortimeofsortstart.*在排序开始时的资源快照*/#ifdefTRACE_SORTPGRUsageru_start;#endif};

到此，相信大家对“PostgreSQL中的Tuplesortstate数据结构是怎样的”有了更深的了解，不妨来实际操作一番吧！这里是亿速云网站，更多相关内容可以进入相关频道进行查询，关注我们，继续学习！