An Analysis of the PostgreSQL checkpointer Background Process
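1. Data Structures
CheckPoint
The body of a CheckPoint XLOG record. A copy of the latest one is also kept in pg_control for possible disaster recovery.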
/*
 * Body of CheckPoint XLOG records.  This is declared here because we keep
 * a copy of the latest one in pg_control for possible disaster recovery.
 * Changing this struct requires a PG_CONTROL_VERSION bump.
 */
typedef struct CheckPoint
{
    XLogRecPtr  redo;               /* next RecPtr available when we began to
                                     * create CheckPoint (i.e. REDO start point) */
    TimeLineID  ThisTimeLineID;     /* current TLI */
    TimeLineID  PrevTimeLineID;     /* previous TLI, if this record begins a new
                                     * timeline (equals ThisTimeLineID otherwise) */
    bool        fullPageWrites;     /* current full_page_writes */
    uint32      nextXidEpoch;       /* higher-order bits of nextXid */
    TransactionId nextXid;          /* next free XID */
    Oid         nextOid;            /* next free OID */
    MultiXactId nextMulti;          /* next free MultiXactId */
    MultiXactOffset nextMultiOffset;    /* next free MultiXact offset */
    TransactionId oldestXid;        /* cluster-wide minimum datfrozenxid */
    Oid         oldestXidDB;        /* database with minimum datfrozenxid */
    MultiXactId oldestMulti;        /* cluster-wide minimum datminmxid */
    Oid         oldestMultiDB;      /* database with minimum datminmxid */
    pg_time_t   time;               /* time stamp of checkpoint */
    TransactionId oldestCommitTsXid;    /* oldest Xid with valid commit
                                         * timestamp */
    TransactionId newestCommitTsXid;    /* newest Xid with valid commit
                                         * timestamp */

    /*
     * Oldest XID still running. This is only needed to initialize hot standby
     * mode from an online checkpoint, so we only bother calculating this for
     * online checkpoints and only when wal_level is replica. Otherwise it's
     * set to InvalidTransactionId.
     */
    TransactionId oldestActiveXid;
} CheckPoint;

/* XLOG info values for XLOG rmgr */
#define XLOG_CHECKPOINT_SHUTDOWN        0x00
#define XLOG_CHECKPOINT_ONLINE          0x10
#define XLOG_NOOP                       0x20
#define XLOG_NEXTOID                    0x30
#define XLOG_SWITCH                     0x40
#define XLOG_BACKUP_END                 0x50
#define XLOG_PARAMETER_CHANGE           0x60
#define XLOG_RESTORE_POINT              0x70
#define XLOG_FPW_CHANGE                 0x80
#define XLOG_END_OF_RECOVERY            0x90
#define XLOG_FPI_FOR_HINT               0xA0
#define XLOG_FPI                        0xB0
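The comments on nextXidEpoch and nextXid say that the epoch carries the higher-order bits of the next XID, so the two fields together describe one 64-bit counter. A minimal sketch of that composition follows. It uses standard C integer types instead of PostgreSQL's typedefs, and full_xid is a hypothetical helper name chosen for illustration, not a PostgreSQL function.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of PostgreSQL): combine the epoch, which
 * holds the high-order 32 bits, with the 32-bit XID in the low-order bits.
 */
uint64_t
full_xid(uint32_t next_xid_epoch, uint32_t next_xid)
{
    return ((uint64_t) next_xid_epoch << 32) | next_xid;
}
```

With an epoch of 0 the result equals the plain 32-bit XID; each XID wraparound adds 2^32 to the combined value.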
CheckpointerShmem
The shared memory structure used for communication between the checkpointer process and the backends.
/*----------
 * Shared memory area for communication between checkpointer and backends
 *
 * The ckpt counters allow backends to watch for completion of a checkpoint
 * request they send.  Here's how it works:
 *  * At start of a checkpoint, checkpointer reads (and clears) the request
 *    flags and increments ckpt_started, while holding ckpt_lck.
 *  * On completion of a checkpoint, checkpointer sets ckpt_done to
 *    equal ckpt_started.
 *  * On failure of a checkpoint, checkpointer increments ckpt_failed
 *    and sets ckpt_done to equal ckpt_started.
 *
 * The algorithm for backends is:
 *  1. Record current values of ckpt_failed and ckpt_started, and
 *     set request flags, while holding ckpt_lck.
 *  2. Send signal to request checkpoint.
 *  3. Sleep until ckpt_started changes.  Now you know a checkpoint has
 *     begun since you started this algorithm (although *not* that it was
 *     specifically initiated by your signal), and that it is using your flags.
 *  4. Record new value of ckpt_started.
 *  5. Sleep until ckpt_done >= saved value of ckpt_started.  (Use modulo
 *     arithmetic here in case counters wrap around.)  Now you know a
 *     checkpoint has started and completed, but not whether it was
 *     successful.
 *  6. If ckpt_failed is different from the originally saved value,
 *     assume request failed; otherwise it was definitely successful.
 *
 * ckpt_flags holds the OR of the checkpoint request flags sent by all
 * requesting backends since the last checkpoint start.  The flags are
 * chosen so that OR'ing is the correct way to combine multiple requests.
 *
 * num_backend_writes is used to count the number of buffer writes performed
 * by user backend processes.  This counter should be wide enough that it
 * can't overflow during a single processing cycle.  num_backend_fsync
 * counts the subset of those writes that also had to do their own fsync,
 * because the checkpointer failed to absorb their request.
 *
 * The requests array holds fsync requests sent by backends and not yet
 * absorbed by the checkpointer.
 *
 * Unlike the checkpoint fields, num_backend_writes, num_backend_fsync, and
 * the requests fields are protected by CheckpointerCommLock.
 *----------
 */
typedef struct
{
    RelFileNode rnode;          /* tablespace / database / relation */
    ForkNumber  forknum;        /* fork number */
    BlockNumber segno;          /* see md.c for special values */
    /* might add a real request-type field later; not needed yet */
} CheckpointerRequest;

typedef struct
{
    pid_t       checkpointer_pid;   /* PID (0 if not started) */

    slock_t     ckpt_lck;       /* protects all the ckpt_* fields */

    int         ckpt_started;   /* advances when checkpoint starts */
    int         ckpt_done;      /* advances when checkpoint done */
    int         ckpt_failed;    /* advances when checkpoint fails */

    int         ckpt_flags;     /* checkpoint flags, as defined in xlog.h */

    uint32      num_backend_writes; /* counts user backend buffer writes */
    uint32      num_backend_fsync;  /* counts user backend fsync calls */

    int         num_requests;   /* current # of requests */
    int         max_requests;   /* allocated array size */
    CheckpointerRequest requests[FLEXIBLE_ARRAY_MEMBER];
} CheckpointerShmemStruct;

/* File-local pointer to the shared structure */
static CheckpointerShmemStruct *CheckpointerShmem;

2. Source Code Walkthrough
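CheckpointerMain is the entry point of the checkpointer process.
The function first installs its signal handlers (registering callbacks like this will look familiar to anyone who has registered listeners in Java), then creates the memory context the process works in, and finally enters an endless loop in which it performs a checkpoint whenever one has been requested or the checkpoint timeout has elapsed.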
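The handler-plus-flag pattern that the function relies on can be sketched in isolation. This is a simplified standalone illustration using the standard signal() call rather than PostgreSQL's pqsignal and latch machinery; the function names install_request_handler and consume_checkpoint_request are hypothetical. The handler only records that a request arrived, and the main loop later notices and clears the flag, much as the loop below tests and resets checkpoint_requested.

```c
#include <signal.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t checkpoint_requested = 0;

/* The handler does the minimum possible: note that a request arrived. */
static void
req_checkpoint_handler(int signo)
{
    (void) signo;
    checkpoint_requested = 1;
}

/* Install the handler for SIGINT; returns 0 on success, -1 on failure. */
int
install_request_handler(void)
{
    return signal(SIGINT, req_checkpoint_handler) == SIG_ERR ? -1 : 0;
}

/*
 * One loop iteration's check: returns 1 if a request was pending and
 * clears the flag, otherwise returns 0.
 */
int
consume_checkpoint_request(void)
{
    if (checkpoint_requested)
    {
        checkpoint_requested = 0;
        return 1;
    }
    return 0;
}
```

Keeping the handler this small is the usual design choice: all real work happens in the main loop, where it is safe to take locks and allocate memory.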
/*
 * Main entry point for checkpointer process
 *
 * This is invoked from AuxiliaryProcessMain, which has already created the
 * basic execution environment, but not enabled signals yet.
 */
void
CheckpointerMain(void)
{
    sigjmp_buf  local_sigjmp_buf;
    MemoryContext checkpointer_context;

    CheckpointerShmem->checkpointer_pid = MyProcPid;

    /*
     * Properly accept or ignore signals the postmaster might send us
     *
     * Note: we deliberately ignore SIGTERM, because during a standard Unix
     * system shutdown cycle, init will SIGTERM all processes at once.  We
     * want to wait for the backends to exit, whereupon the postmaster will
     * tell us it's okay to shut down (via SIGUSR2).
     */
    pqsignal(SIGHUP, ChkptSigHupHandler);   /* set flag to read config file */
    pqsignal(SIGINT, ReqCheckpointHandler); /* request checkpoint */
    pqsignal(SIGTERM, SIG_IGN);             /* ignore SIGTERM */
    pqsignal(SIGQUIT, chkpt_quickdie);      /* hard crash time */
    pqsignal(SIGALRM, SIG_IGN);
    pqsignal(SIGPIPE, SIG_IGN);
    pqsignal(SIGUSR1, chkpt_sigusr1_handler);
    pqsignal(SIGUSR2, ReqShutdownHandler);  /* request shutdown */

    /*
     * Reset some signals that are accepted by postmaster but not here
     */
    pqsignal(SIGCHLD, SIG_DFL);

    /* We allow SIGQUIT (quickdie) at all times */
    sigdelset(&BlockSig, SIGQUIT);

    /*
     * Initialize so that first time-driven event happens at the correct time.
     */
    last_checkpoint_time = last_xlog_switch_time = (pg_time_t) time(NULL);

    /*
     * Create a memory context that we will do all our work in.  We do this so
     * that we can reset the context during error recovery and thereby avoid
     * possible memory leaks.  Formerly this code just ran in
     * TopMemoryContext, but resetting that would be a really bad idea.
     */
    checkpointer_context = AllocSetContextCreate(TopMemoryContext,
                                                 "Checkpointer",
                                                 ALLOCSET_DEFAULT_SIZES);
    MemoryContextSwitchTo(checkpointer_context);

    /*
     * If an exception is encountered, processing resumes here.
     *
     * See notes in postgres.c about the design of this coding.
     */
    if (sigsetjmp(local_sigjmp_buf, 1) != 0)
    {
        /* Since not using PG_TRY, must reset error stack by hand */
        error_context_stack = NULL;

        /* Prevent interrupts while cleaning up */
        HOLD_INTERRUPTS();

        /* Report the error to the server log */
        EmitErrorReport();

        /*
         * These operations are really just a minimal subset of
         * AbortTransaction().  We don't have very many resources to worry
         * about in checkpointer, but we do have LWLocks, buffers, and temp
         * files.
         */
        LWLockReleaseAll();
        ConditionVariableCancelSleep();
        pgstat_report_wait_end();
        AbortBufferIO();
        UnlockBuffers();
        ReleaseAuxProcessResources(false);
        AtEOXact_Buffers(false);
        AtEOXact_SMgr();
        AtEOXact_Files(false);
        AtEOXact_HashTables(false);

        /* Warn any waiting backends that the checkpoint failed. */
        if (ckpt_active)
        {
            SpinLockAcquire(&CheckpointerShmem->ckpt_lck);
            CheckpointerShmem->ckpt_failed++;
            CheckpointerShmem->ckpt_done = CheckpointerShmem->ckpt_started;
            SpinLockRelease(&CheckpointerShmem->ckpt_lck);

            ckpt_active = false;
        }

        /*
         * Now return to normal top-level context and clear ErrorContext for
         * next time.
         */
        MemoryContextSwitchTo(checkpointer_context);
        FlushErrorState();

        /* Flush any leaked data in the top-level context */
        MemoryContextResetAndDeleteChildren(checkpointer_context);

        /* Now we can allow interrupts again */
        RESUME_INTERRUPTS();

        /*
         * Sleep at least 1 second after any error.  A write error is likely
         * to be repeated, and we don't want to be filling the error logs as
         * fast as we can.
         */
        pg_usleep(1000000L);

        /*
         * Close all open files after any error.  This is helpful on Windows,
         * where holding deleted files open causes various strange errors.
         * It's not clear we need it elsewhere, but shouldn't hurt.
         */
        smgrcloseall();
    }

    /* We can now handle ereport(ERROR) */
    PG_exception_stack = &local_sigjmp_buf;

    /*
     * Unblock signals (they were blocked when the postmaster forked us)
     */
    PG_SETMASK(&UnBlockSig);

    /*
     * Ensure all shared memory values are set correctly for the config. Doing
     * this here ensures no race conditions from other concurrent updaters.
     */
    UpdateSharedMemoryConfig();

    /*
     * Advertise our latch that backends can use to wake us up while we're
     * sleeping.
     */
    ProcGlobal->checkpointerLatch = &MyProc->procLatch;

    /*
     * Loop forever
     */
    for (;;)
    {
        bool        do_checkpoint = false;  /* run a checkpoint this cycle? */
        int         flags = 0;              /* checkpoint flags */
        pg_time_t   now;
        int         elapsed_secs;           /* seconds since last checkpoint */
        int         cur_timeout;            /* seconds to sleep */

        /* Clear any already-pending wakeups */
        ResetLatch(MyLatch);

        /*
         * Process any requests or signals received recently.
         */
        AbsorbFsyncRequests();

        if (got_SIGHUP)
        {
            got_SIGHUP = false;
            ProcessConfigFile(PGC_SIGHUP);

            /*
             * Checkpointer is the last process to shut down, so we ask it to
             * hold the keys for a range of other tasks required most of which
             * have nothing to do with checkpointing at all.
             *
             * For various reasons, some config values can change dynamically
             * so the primary copy of them is held in shared memory to make
             * sure all backends see the same value.  We make Checkpointer
             * responsible for updating the shared memory copy if the
             * parameter setting changes because of SIGHUP.
             */
            UpdateSharedMemoryConfig();
        }
        if (checkpoint_requested)
        {
            checkpoint_requested = false;
            do_checkpoint = true;
            BgWriterStats.m_requested_checkpoints++;
        }
        if (shutdown_requested)
        {
            /*
             * From here on, elog(ERROR) should end with exit(1), not send
             * control back to the sigsetjmp block above
             */
            ExitOnAnyError = true;
            /* Close down the database */
            ShutdownXLOG(0, 0);
            /* Normal exit from the checkpointer is here */
            proc_exit(0);       /* done */
        }

        /*
         * Force a checkpoint if too much time has elapsed since the last one.
         * Note that we count a timed checkpoint in stats only when this
         * occurs without an external request, but we set the CAUSE_TIME flag
         * bit even if there is also an external request.
         */
        now = (pg_time_t) time(NULL);
        elapsed_secs = now - last_checkpoint_time;
        if (elapsed_secs >= CheckPointTimeout)
        {
            if (!do_checkpoint)
                BgWriterStats.m_timed_checkpoints++;
            do_checkpoint = true;
            flags |= CHECKPOINT_CAUSE_TIME;
        }

        /*
         * Do a checkpoint if requested.
         */
        if (do_checkpoint)
        {
            bool        ckpt_performed = false;
            bool        do_restartpoint;

            /*
             * Check if we should perform a checkpoint or a restartpoint. As a
             * side-effect, RecoveryInProgress() initializes TimeLineID if
             * it's not set yet.
             */
            do_restartpoint = RecoveryInProgress();

            /*
             * Atomically fetch the request flags to figure out what kind of a
             * checkpoint we should perform, and increase the started-counter
             * to acknowledge that we've started a new checkpoint.
             */
            SpinLockAcquire(&CheckpointerShmem->ckpt_lck);
            flags |= CheckpointerShmem->ckpt_flags;
            CheckpointerShmem->ckpt_flags = 0;
            CheckpointerShmem->ckpt_started++;
            SpinLockRelease(&CheckpointerShmem->ckpt_lck);

            /*
             * The end-of-recovery checkpoint is a real checkpoint that's
             * performed while we're still in recovery.
             */
            if (flags & CHECKPOINT_END_OF_RECOVERY)
                do_restartpoint = false;

            /*
             * We will warn if (a) too soon since last checkpoint (whatever
             * caused it) and (b) somebody set the CHECKPOINT_CAUSE_XLOG flag
             * since the last checkpoint start.  Note in particular that this
             * implementation will not generate warnings caused by
             * CheckPointTimeout < CheckPointWarning.
             */
            if (!do_restartpoint &&
                (flags & CHECKPOINT_CAUSE_XLOG) &&
                elapsed_secs < CheckPointWarning)
                ereport(LOG,
                        (errmsg_plural("checkpoints are occurring too frequently (%d second apart)",
                                       "checkpoints are occurring too frequently (%d seconds apart)",
                                       elapsed_secs,
                                       elapsed_secs),
                         errhint("Consider increasing the configuration parameter \"max_wal_size\".")));

            /*
             * Initialize checkpointer-private variables used during
             * checkpoint.
             */
            ckpt_active = true;
            if (do_restartpoint)
                ckpt_start_recptr = GetXLogReplayRecPtr(NULL);  /* replay position */
            else
                ckpt_start_recptr = GetInsertRecPtr();          /* insert position */
            ckpt_start_time = now;
            ckpt_cached_elapsed = 0;

            /*
             * Do the checkpoint.
             */
            if (!do_restartpoint)
            {
                CreateCheckPoint(flags);
                ckpt_performed = true;
            }
            else
                ckpt_performed = CreateRestartPoint(flags);

            /*
             * After any checkpoint, close all smgr files.  This is so we
             * won't hang onto smgr references to deleted files indefinitely.
             */
            smgrcloseall();

            /*
             * Indicate checkpoint completion to any waiting backends.
             */
            SpinLockAcquire(&CheckpointerShmem->ckpt_lck);
            CheckpointerShmem->ckpt_done = CheckpointerShmem->ckpt_started;
            SpinLockRelease(&CheckpointerShmem->ckpt_lck);

            if (ckpt_performed)
            {
                /*
                 * Note we record the checkpoint start time not end time as
                 * last_checkpoint_time.  This is so that time-driven
                 * checkpoints happen at a predictable spacing.
                 */
                last_checkpoint_time = now;
            }
            else
            {
                /*
                 * We were not able to perform the restartpoint (checkpoints
                 * throw an ERROR in case of error).  Most likely because we
                 * have not received any new checkpoint WAL records since the
                 * last restartpoint. Try again in 15 s.
                 */
                last_checkpoint_time = now - CheckPointTimeout + 15;
            }

            ckpt_active = false;
        }

        /* Check for archive_timeout and switch xlog files if necessary. */
        CheckArchiveTimeout();

        /*
         * Send off activity statistics to the stats collector.  (The reason
         * why we re-use bgwriter-related code for this is that the bgwriter
         * and checkpointer used to be just one process.  It's probably not
         * worth the trouble to split the stats support into two independent
         * stats message types.)
         */
        pgstat_send_bgwriter();

        /*
         * Sleep until we are signaled or it's time for another checkpoint or
         * xlog file switch.
         */
        now = (pg_time_t) time(NULL);
        elapsed_secs = now - last_checkpoint_time;
        if (elapsed_secs >= CheckPointTimeout)
            continue;           /* no sleep for us ... */
        cur_timeout = CheckPointTimeout - elapsed_secs;
        if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
        {
            elapsed_secs = now - last_xlog_switch_time;
            if (elapsed_secs >= XLogArchiveTimeout)
                continue;       /* no sleep for us ... */
            /* sleep for the shorter of the two remaining intervals */
            cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
        }

        (void) WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                         cur_timeout * 1000L /* convert to ms */ ,
                         WAIT_EVENT_CHECKPOINTER_MAIN);
    }
}

/*
 * Unix-like signal handler installation
 *
 * Only called on main thread, no sync required
 */
pqsigfunc
pqsignal(int signum, pqsigfunc handler)
{
    pqsigfunc   prevfunc;

    if (signum >= PG_SIGNAL_COUNT || signum < 0)
        return SIG_ERR;                 /* invalid signal number */
    prevfunc = pg_signal_array[signum]; /* remember the previous handler */
    pg_signal_array[signum] = handler;  /* register the new handler */
    return prevfunc;
}

/*
 * GetInsertRecPtr -- Returns the current insert position.
 *
 * NOTE: The value *actually* returned is the position of the last full
 * xlog page. It lags behind the real insert position by at most 1 page.
 * For that, we don't need to scan through WAL insertion locks, and an
 * approximation is enough for the current usage of this function.
 */
XLogRecPtr
GetInsertRecPtr(void)
{
    XLogRecPtr  recptr;

    SpinLockAcquire(&XLogCtl->info_lck);
    recptr = XLogCtl->LogwrtRqst.Write;
    SpinLockRelease(&XLogCtl->info_lck);

    return recptr;
}

3. Trace Analysis
Create a table, insert a row, and run a checkpoint.
testdb=# drop table t_wal_ckpt;
DROP TABLE
testdb=# create table t_wal_ckpt(c1 int not null,c2 varchar(40),c3 varchar(40));
CREATE TABLE
testdb=# insert into t_wal_ckpt(c1,c2,c3) values(1,'C2-1','C3-1');
INSERT 0 1
testdb=#
testdb=# checkpoint;   --> first checkpoint
Update the row, then run another checkpoint.
testdb=# update t_wal_ckpt set c2 = 'C2#'||substr(c2,4,40);
UPDATE 1
testdb=# checkpoint;
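Start gdb on the checkpointer process and tell it to pass SIGINT through to the process without stopping, then set a breakpoint where the request flags are fetched.
(gdb) handle SIGINT print nostop pass
SIGINT is used by the debugger.
Are you sure you want to change it? (y or n) y
Signal        Stop      Print   Pass to program Description
SIGINT        No        Yes     Yes             Interrupt
(gdb)
(gdb) b checkpointer.c:441
Breakpoint 1 at 0x815197: file checkpointer.c, line 441.
(gdb) c
Continuing.

Program received signal SIGINT, Interrupt.

Breakpoint 1, CheckpointerMain () at checkpointer.c:441
441             flags |= CheckpointerShmem->ckpt_flags;
(gdb)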
Inspect the shared memory structure CheckpointerShmem.
(gdb) p *CheckpointerShmem
$1 = {checkpointer_pid = 1650, ckpt_lck = 1 '\001', ckpt_started = 2, ckpt_done = 2, ckpt_failed = 0, ckpt_flags = 44, num_backend_writes = 0, num_backend_fsync = 0, num_requests = 0, max_requests = 65536, requests = 0x7f2cdda07b28}
(gdb)
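The value ckpt_flags = 44 is 0x2C, a combination of three request bits. The sketch below decodes it, assuming the CHECKPOINT_* values listed match xlog.h in the source tree being debugged. They are quoted from memory for this era of the code, so verify them against your own headers. The helper has_flags is a hypothetical name, not a PostgreSQL function.

```c
/* Assumed values; verify against the CHECKPOINT_* macros in your xlog.h. */
#define CHECKPOINT_IS_SHUTDOWN      0x0001
#define CHECKPOINT_END_OF_RECOVERY  0x0002
#define CHECKPOINT_IMMEDIATE        0x0004
#define CHECKPOINT_FORCE            0x0008
#define CHECKPOINT_FLUSH_ALL        0x0010
#define CHECKPOINT_WAIT             0x0020

/* Returns 1 if every bit of 'mask' is set in 'flags', otherwise 0. */
int
has_flags(int flags, int mask)
{
    return (flags & mask) == mask;
}
```

Under these assumptions, 44 decodes to CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT, which is consistent with a CHECKPOINT command issued manually from psql.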
Fetch and clear the request flags in CheckpointerShmem and advance ckpt_started.
441             flags |= CheckpointerShmem->ckpt_flags;
(gdb) n
442             CheckpointerShmem->ckpt_flags = 0;
(gdb)
443             CheckpointerShmem->ckpt_started++;
(gdb)
444             SpinLockRelease(&CheckpointerShmem->ckpt_lck);
(gdb)
450             if (flags & CHECKPOINT_END_OF_RECOVERY)
(gdb)
460             if (!do_restartpoint &&
(gdb)
461                 (flags & CHECKPOINT_CAUSE_XLOG) &&
(gdb)
460             if (!do_restartpoint &&
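Initialize the checkpointer-private variables used during the checkpoint.
ckpt_start_recptr is set to the WAL insert position returned by GetInsertRecPtr, which serves here as the approximate Redo point (the function's own comment notes it can lag the real insert position by up to one page). The value 5521180544 is 0x149168780 in hexadecimal, that is, LSN 1/49168780.
(gdb)
474             ckpt_active = true;
(gdb)
475             if (do_restartpoint)
(gdb)
478                 ckpt_start_recptr = GetInsertRecPtr();
(gdb) p XLogCtl->LogwrtRqst
$1 = {Write = 5521180544, Flush = 5521180544}
(gdb) n
479             ckpt_start_time = now;
(gdb) p ckpt_start_recptr
$2 = 5521180544
(gdb) n
480             ckpt_cached_elapsed = 0;
(gdb)
485             if (!do_restartpoint)
(gdb)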
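The decimal-to-LSN conversion can be reproduced with two small helpers. This is a sketch using standard C types; lsn_hi and lsn_lo are hypothetical names. A 64-bit WAL position is conventionally displayed as two 32-bit hexadecimal halves separated by a slash.

```c
#include <stdint.h>

/* High 32 bits of a 64-bit WAL position (the part before the slash). */
uint32_t
lsn_hi(uint64_t lsn)
{
    return (uint32_t) (lsn >> 32);
}

/* Low 32 bits of a 64-bit WAL position (the part after the slash). */
uint32_t
lsn_lo(uint64_t lsn)
{
    return (uint32_t) lsn;
}
```

For example, printf("%X/%X\n", lsn_hi(5521180544ULL), lsn_lo(5521180544ULL)) prints 1/49168780 on platforms where uint32_t is unsigned int.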
Perform the checkpoint. It completes successfully.
(gdb)
487                 CreateCheckPoint(flags);
(gdb)
488                 ckpt_performed = true;
(gdb)
Close the smgr files and publish completion in shared memory by setting ckpt_done equal to ckpt_started.
497             smgrcloseall();
(gdb)
502             SpinLockAcquire(&CheckpointerShmem->ckpt_lck);
(gdb)
503             CheckpointerShmem->ckpt_done = CheckpointerShmem->ckpt_started;
(gdb)
504             SpinLockRelease(&CheckpointerShmem->ckpt_lck);
(gdb)
506             if (ckpt_performed)
(gdb) p CheckpointerShmem
$3 = (CheckpointerShmemStruct *) 0x7fcecc063b00
(gdb) p *CheckpointerShmem
$4 = {checkpointer_pid = 1697, ckpt_lck = 0 '\000', ckpt_started = 1, ckpt_done = 1, ckpt_failed = 0, ckpt_flags = 0, num_backend_writes = 0, num_backend_fsync = 0, num_requests = 0, max_requests = 65536, requests = 0x7fcecc063b28}
(gdb)
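A waiting backend decides that its checkpoint has finished by comparing ckpt_done with the ckpt_started value it saved, and the structure's header comment asks for modulo arithmetic in case the counters wrap around. One way to write such a comparison is sketched below. The function name checkpoint_finished is hypothetical, and this is an illustration of the idea rather than the code PostgreSQL itself uses.

```c
#include <stdint.h>

/*
 * Returns 1 once ckpt_done has caught up with the ckpt_started value the
 * backend saved, 0 otherwise.  The subtraction is performed on unsigned
 * values so it stays well defined when the counters wrap around; a
 * difference in the lower half of the 32-bit range means "at or past".
 */
int
checkpoint_finished(int ckpt_done, int saved_started)
{
    uint32_t    diff = (uint32_t) ckpt_done - (uint32_t) saved_started;

    return diff < 0x80000000u;
}
```

With the values shown in the gdb output (ckpt_started = 1, ckpt_done = 1) the function returns 1. The backend would then compare ckpt_failed with its saved copy to learn whether the checkpoint succeeded.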
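The fsync request queue is empty: num_requests is 0, and requests[0] contains only zeroed fields.
(gdb) p CheckpointerShmem->requests[0]
$5 = {rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forknum = MAIN_FORKNUM, segno = 0}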
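Check archive_timeout and switch xlog files if necessary.
The process would then sleep until it is signaled or until it is time for another checkpoint or xlog file switch. In this session, however, elapsed_secs (1044) already exceeds CheckPointTimeout (900) because of the time spent in the debugger, so the loop takes the continue branch and does not sleep.
(gdb) n
513                 last_checkpoint_time = now;
(gdb)
526             ckpt_active = false;
(gdb)
530         CheckArchiveTimeout();
(gdb)
539         pgstat_send_bgwriter();
(gdb)
545         now = (pg_time_t) time(NULL);
(gdb)
546         elapsed_secs = now - last_checkpoint_time;
(gdb)
547         if (elapsed_secs >= CheckPointTimeout)
(gdb) p elapsed_secs
$7 = 1044
(gdb) p CheckPointTimeout
$8 = 900
(gdb) n
548             continue;       /* no sleep for us ... */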
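The sleep-time calculation at the bottom of the loop can be isolated as a small function. The sketch below follows the logic of the source shown earlier but is a standalone simplification: next_sleep_secs is a hypothetical name, the RecoveryInProgress() test is omitted, and an archive timeout of 0 is taken to mean that the archive check is disabled.

```c
/*
 * Seconds to sleep before the next time-driven event, or 0 when an event
 * is already due.  Hypothetical standalone helper for illustration.
 */
int
next_sleep_secs(int ckpt_elapsed, int ckpt_timeout,
                int switch_elapsed, int archive_timeout)
{
    int         cur_timeout;

    if (ckpt_elapsed >= ckpt_timeout)
        return 0;               /* checkpoint overdue: no sleep */
    cur_timeout = ckpt_timeout - ckpt_elapsed;

    if (archive_timeout > 0)
    {
        if (switch_elapsed >= archive_timeout)
            return 0;           /* xlog switch overdue: no sleep */
        /* sleep for the shorter of the two remaining intervals */
        if (archive_timeout - switch_elapsed < cur_timeout)
            cur_timeout = archive_timeout - switch_elapsed;
    }
    return cur_timeout;
}
```

With the values from the trace (1044 seconds elapsed against a 900-second timeout) the function returns 0, matching the continue branch taken in gdb.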
Because the timeout has elapsed, the next loop iteration starts a new, time-driven checkpoint: no request is pending, so m_timed_checkpoints is incremented and CHECKPOINT_CAUSE_TIME is set.
(gdb)
569     }
(gdb)
352         bool        do_checkpoint = false;
(gdb)
353         int         flags = 0;
(gdb) n
360         ResetLatch(MyLatch);
(gdb)
365         AbsorbFsyncRequests();
(gdb)
367         if (got_SIGHUP)
(gdb)
385         if (checkpoint_requested)
(gdb)
391         if (shutdown_requested)
(gdb)
410         now = (pg_time_t) time(NULL);
(gdb)
411         elapsed_secs = now - last_checkpoint_time;
(gdb)
412         if (elapsed_secs >= CheckPointTimeout)
(gdb) p elapsed_secs
$9 = 1131
(gdb) n
414             if (!do_checkpoint)
(gdb)
415                 BgWriterStats.m_timed_checkpoints++;
(gdb)
416             do_checkpoint = true;
(gdb)
417             flags |= CHECKPOINT_CAUSE_TIME;
(gdb)
423         if (do_checkpoint)
(gdb)
425             bool        ckpt_performed = false;
(gdb)
433             do_restartpoint = RecoveryInProgress();
(gdb)
440             SpinLockAcquire(&CheckpointerShmem->ckpt_lck);
(gdb)

Breakpoint 1, CheckpointerMain () at checkpointer.c:441
441             flags |= CheckpointerShmem->ckpt_flags;
(gdb)
442             CheckpointerShmem->ckpt_flags = 0;
(gdb)
443             CheckpointerShmem->ckpt_started++;
(gdb) c
Continuing.
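The decision traced above, whether to run a checkpoint and how to count it, can be condensed into a small standalone function. This is a sketch only: DemoStats, decide_checkpoint and DEMO_CAUSE_TIME are hypothetical names, and DEMO_CAUSE_TIME is a placeholder bit because the real CHECKPOINT_CAUSE_TIME value is defined in xlog.h and is not shown in this article.

```c
/* Placeholder bit for illustration; not the real CHECKPOINT_CAUSE_TIME. */
#define DEMO_CAUSE_TIME 0x1000

typedef struct
{
    int         requested_checkpoints;
    int         timed_checkpoints;
} DemoStats;

/*
 * Mirrors the decision at the top of the loop: returns 1 if a checkpoint
 * should run, and ORs the time-cause bit into *flags when the timeout
 * has elapsed.
 */
int
decide_checkpoint(int requested, int elapsed_secs, int timeout_secs,
                  int *flags, DemoStats *stats)
{
    int         do_checkpoint = 0;

    if (requested)
    {
        do_checkpoint = 1;
        stats->requested_checkpoints++;
    }
    if (elapsed_secs >= timeout_secs)
    {
        if (!do_checkpoint)
            stats->timed_checkpoints++; /* counted only without a request */
        do_checkpoint = 1;
        *flags |= DEMO_CAUSE_TIME;      /* set even alongside a request */
    }
    return do_checkpoint;
}
```

In the trace, no request was pending and 1131 seconds had elapsed against a 900-second timeout, so the checkpoint is counted as timed and the time-cause bit is set.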