What Does the StrategyGetBuffer Function Do in PostgreSQL?

Published: 2021-11-09 15:57:52 | Source: 亿速云 (Yisu Cloud) | Views: 177 | Author: iii | Category: Relational Databases

This article explains what the StrategyGetBuffer function does in PostgreSQL. It first lists the data structures involved, then walks through the function's source code, and finally traces one call in gdb.

I. Data Structures

BufferDesc
Shared descriptor (state) data for a single shared buffer.

/*
 * Flags for buffer descriptors
 *
 * Note: TAG_VALID essentially means that there is a buffer hashtable
 * entry associated with the buffer's tag.
 */
#define BM_LOCKED               (1U << 22)  /* buffer header is locked */
#define BM_DIRTY                (1U << 23)  /* data needs writing */
#define BM_VALID                (1U << 24)  /* data is valid */
#define BM_TAG_VALID            (1U << 25)  /* tag is assigned */
#define BM_IO_IN_PROGRESS       (1U << 26)  /* read or write in progress */
#define BM_IO_ERROR             (1U << 27)  /* previous I/O failed */
#define BM_JUST_DIRTIED         (1U << 28)  /* dirtied since write started */
#define BM_PIN_COUNT_WAITER     (1U << 29)  /* have waiter for sole pin */
#define BM_CHECKPOINT_NEEDED    (1U << 30)  /* must write for checkpoint */
#define BM_PERMANENT            (1U << 31)  /* permanent buffer (not unlogged,
                                             * or init fork) */

/*
 *  BufferDesc -- shared descriptor/state data for a single shared buffer.
 *
 * Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
 * the tag, state or wait_backend_pid fields.  In general, buffer header lock
 * is a spinlock which is combined with flags, refcount and usagecount into
 * single atomic variable.  This layout allow us to do some operations in a
 * single atomic operation, without actually acquiring and releasing spinlock;
 * for instance, increase or decrease refcount.  buf_id field never changes
 * after initialization, so does not need locking.  freeNext is protected by
 * the buffer_strategy_lock not buffer header lock.  The LWLock can take care
 * of itself.  The buffer header lock is *not* used to control access to the
 * data in the buffer!
 *
 * It's assumed that nobody changes the state field while buffer header lock
 * is held.  Thus buffer header lock holder can do complex updates of the
 * state variable in single write, simultaneously with lock release (cleaning
 * BM_LOCKED flag).  On the other hand, updating of state without holding
 * buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
 * is not set.  Atomic increment/decrement, OR/AND etc. are not allowed.
 *
 * An exception is that if we have the buffer pinned, its tag can't change
 * underneath us, so we can examine the tag without locking the buffer header.
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
 *
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own PID into
 * wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
 *
 * We use this same struct for local buffer headers, but the locks are not
 * used and not all of the flag bits are useful either. To avoid unnecessary
 * overhead, manipulations of the state field should be done without actual
 * atomic operations (i.e. only pg_atomic_read_u32() and
 * pg_atomic_unlocked_write_u32()).
 *
 * Be careful to avoid increasing the size of the struct when adding or
 * reordering members.  Keeping it below 64 bytes (the most common CPU
 * cache line size) is fairly important for performance.
 */
typedef struct BufferDesc
{
    BufferTag   tag;            /* ID of page contained in buffer */
    int         buf_id;         /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pid;   /* backend PID of pin-count waiter */
    int         freeNext;       /* link in freelist chain */
    LWLock      content_lock;   /* to lock access to buffer contents */
} BufferDesc;
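The header comment states that a state update made without holding the buffer header lock is restricted to compare-and-swap, and may only succeed while BM_LOCKED is clear. The following is a minimal sketch of that rule, not PostgreSQL's own code: it uses C11 `<stdatomic.h>` in place of the `pg_atomic_*` API, `pin_with_cas` is a hypothetical helper, and the plain re-read stands in for whatever waiting logic the real code uses. It assumes the refcount occupies the low bits of the state word, as in the PostgreSQL version shown here.

```c
#include <stdatomic.h>
#include <stdint.h>

#define BM_LOCKED         (1U << 22)  /* same bit as in the listing above */
#define BUF_REFCOUNT_ONE  1U          /* assumption: refcount is in the low bits */

/*
 * Hypothetical helper: add one to the refcount without taking the header
 * lock.  The CAS is only attempted against a value in which BM_LOCKED was
 * clear, so it can never overwrite an update made by the lock holder.
 */
static uint32_t
pin_with_cas(_Atomic uint32_t *state)
{
    uint32_t old_state = atomic_load(state);

    for (;;)
    {
        uint32_t new_state;

        if (old_state & BM_LOCKED)
        {
            /* Somebody holds the header lock: re-read and try again. */
            old_state = atomic_load(state);
            continue;
        }

        new_state = old_state + BUF_REFCOUNT_ONE;

        /* On failure, old_state is refreshed with the current value. */
        if (atomic_compare_exchange_weak(state, &old_state, new_state))
            return new_state;
    }
}
```

A plain atomic increment would be simpler, but it could land while another backend holds the header lock and is about to overwrite the whole word, which is why the comment rules it out.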

BufferTag
The buffer tag identifies which disk block the buffer contains.

/*
 * Buffer tag identifies which disk block the buffer contains.
 *
 * Note: the BufferTag data must be sufficient to determine where to write the
 * block, without reference to pg_class or pg_tablespace entries.  It's
 * possible that the backend flushing the buffer doesn't even believe the
 * relation is visible yet (its xact may have started before the xact that
 * created the rel).  The storage manager must be able to cope anyway.
 *
 * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
 * to be fixed to zero them, since this struct is used as a hash key.
 */
typedef struct buftag
{
    RelFileNode rnode;          /* physical relation identifier */
    ForkNumber  forkNum;
    BlockNumber blockNum;       /* blknum relative to begin of reln */
} BufferTag;
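The note about pad bytes matters because a hash table keyed on the raw bytes of the struct also hashes and compares any padding. Below is a minimal sketch of the defensive variant the note alludes to: zero the whole struct before filling it in. `DemoTag`, `demo_tag_init` and `demo_tag_equal` are hypothetical stand-ins with simplified field types; they are not PostgreSQL definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for BufferTag; names and types are illustrative only. */
typedef struct DemoTag
{
    uint32_t spc_node;   /* tablespace */
    uint32_t db_node;    /* database */
    uint32_t rel_node;   /* relation */
    int32_t  fork_num;   /* fork number */
    uint32_t block_num;  /* block number within the fork */
} DemoTag;

/* Zero the whole struct first so that any padding bytes are deterministic. */
static void
demo_tag_init(DemoTag *tag, uint32_t spc, uint32_t db, uint32_t rel,
              int32_t fork, uint32_t block)
{
    memset(tag, 0, sizeof(*tag));
    tag->spc_node = spc;
    tag->db_node = db;
    tag->rel_node = rel;
    tag->fork_num = fork;
    tag->block_num = block;
}

/* Bytewise comparison, as a hash table keyed on raw bytes would perform. */
static bool
demo_tag_equal(const DemoTag *a, const DemoTag *b)
{
    return memcmp(a, b, sizeof(DemoTag)) == 0;
}
```

Two tags built from the same five values then compare equal byte for byte, regardless of what the memory held before.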

SMgrRelation
smgr.c maintains a hash table of SMgrRelation objects, which are essentially cached file handles.

/*
 * smgr.c maintains a table of SMgrRelation objects, which are essentially
 * cached file handles.  An SMgrRelation is created (if not already present)
 * by smgropen(), and destroyed by smgrclose().  Note that neither of these
 * operations imply I/O, they just create or destroy a hashtable entry.
 * (But smgrclose() may release associated resources, such as OS-level file
 * descriptors.)
 *
 * An SMgrRelation may have an "owner", which is just a pointer to it from
 * somewhere else; smgr.c will clear this pointer if the SMgrRelation is
 * closed.  We use this to avoid dangling pointers from relcache to smgr
 * without having to make the smgr explicitly aware of relcache.  There
 * can't be more than one "owner" pointer per SMgrRelation, but that's
 * all we need.
 *
 * SMgrRelations that do not have an "owner" are considered to be transient,
 * and are deleted at end of transaction.
 */
typedef struct SMgrRelationData
{
    /* rnode is the hashtable lookup key, so it must be first! */
    RelFileNodeBackend smgr_rnode;  /* relation physical identifier */

    /* pointer to owning pointer, or NULL if none */
    struct SMgrRelationData **smgr_owner;

    /*
     * These next three fields are not actually used or manipulated by smgr,
     * except that they are reset to InvalidBlockNumber upon a cache flush
     * event (in particular, upon truncation of the relation).  Higher levels
     * store cached state here so that it will be reset when truncation
     * happens.  In all three cases, InvalidBlockNumber means "unknown".
     */
    BlockNumber smgr_targblock; /* current insertion target block */
    BlockNumber smgr_fsm_nblocks;   /* last known size of fsm fork */
    BlockNumber smgr_vm_nblocks;    /* last known size of vm fork */

    /* additional public fields may someday exist here */

    /*
     * Fields below here are intended to be private to smgr.c and its
     * submodules.  Do not touch them from elsewhere.
     */
    int         smgr_which;     /* storage manager selector */

    /*
     * for md.c; per-fork arrays of the number of open segments
     * (md_num_open_segs) and the segments themselves (md_seg_fds).
     */
    int         md_num_open_segs[MAX_FORKNUM + 1];
    struct _MdfdVec *md_seg_fds[MAX_FORKNUM + 1];

    /* if unowned, list link in list of all unowned SMgrRelations */
    struct SMgrRelationData *next_unowned_reln;
} SMgrRelationData;

typedef SMgrRelationData *SMgrRelation;
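The "owner" mechanism is a pointer to a pointer: the object remembers where its single owner keeps the reference, so closing the object can null that reference. A minimal sketch of the idea follows. `DemoSmgr`, `demo_set_owner` and `demo_close` are hypothetical names; the real smgr.c functions also maintain the list of unowned relations and release resources, which is omitted here.

```c
#include <stddef.h>

/* Hypothetical stand-in for SMgrRelationData, reduced to the owner link. */
typedef struct DemoSmgr
{
    struct DemoSmgr **owner;  /* pointer to the owning pointer, or NULL */
} DemoSmgr;

/* Make *owner the single owner of reln, unhooking any previous owner. */
static void
demo_set_owner(DemoSmgr **owner, DemoSmgr *reln)
{
    if (reln->owner != NULL)
        *(reln->owner) = NULL;  /* previous owner must not dangle */

    reln->owner = owner;
    *owner = reln;
}

/* Closing the object clears the owner's pointer, so it cannot dangle. */
static void
demo_close(DemoSmgr *reln)
{
    if (reln->owner != NULL)
        *(reln->owner) = NULL;

    reln->owner = NULL;
}
```

The owner (relcache, in the comment's example) only has to hand over the address of its pointer; the lower layer never needs to know what kind of object holds it.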

RelFileNodeBackend
Combining a relfilenode with the backend ID provides all the information needed to locate the physical storage.

/*
 * Augmenting a relfilenode with the backend ID provides all the information
 * we need to locate the physical storage.  The backend ID is InvalidBackendId
 * for regular relations (those accessible to more than one backend), or the
 * owning backend's ID for backend-local relations.  Backend-local relations
 * are always transient and removed in case of a database crash; they are
 * never WAL-logged or fsync'd.
 */
typedef struct RelFileNodeBackend
{
    RelFileNode node;       /* the relfilenode */
    BackendId   backend;    /* the backend ID */
} RelFileNodeBackend;

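StrategyControl
Shared freelist control information (the BufferStrategyControl struct, reached through the StrategyControl pointer).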

/*
 * The shared freelist control information.
 */
typedef struct
{
    /* Spinlock: protects the values below */
    slock_t     buffer_strategy_lock;

    /*
     * Clock sweep hand: index of next buffer to consider grabbing. Note that
     * this isn't a concrete buffer - we only ever increase the value. So, to
     * get an actual buffer, it needs to be used modulo NBuffers.
     */
    pg_atomic_uint32 nextVictimBuffer;

    int         firstFreeBuffer;    /* Head of list of unused buffers */
    int         lastFreeBuffer; /* Tail of list of unused buffers */

    /*
     * NOTE: lastFreeBuffer is undefined when firstFreeBuffer is -1 (that is,
     * when the list is empty)
     */

    /*
     * Statistics.  These counters should be wide enough that they can't
     * overflow during a single bgwriter cycle.
     */
    uint32      completePasses; /* Complete cycles of the clock sweep */
    pg_atomic_uint32 numBufferAllocs;   /* Buffers allocated since last reset */

    /*
     * Bgworker process to be notified upon activity or -1 if none. See
     * StrategyNotifyBgWriter.
     */
    int         bgwprocno;
} BufferStrategyControl;

/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
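The comment on nextVictimBuffer says the counter only ever increases and must be taken modulo NBuffers to name an actual buffer. A minimal sketch of that idea, using C11 atomics: `DemoClock` and `demo_clock_tick` are hypothetical, and the sketch leaves out what the real ClockSweepTick() additionally does, such as handling counter wraparound and maintaining completePasses.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared clock hand. */
typedef struct DemoClock
{
    _Atomic uint32_t next_victim;  /* only ever increases */
    uint32_t         nbuffers;     /* plays the role of NBuffers */
} DemoClock;

/* Advance the hand by one and return the buffer index it pointed at. */
static uint32_t
demo_clock_tick(DemoClock *hand)
{
    /* fetch_add returns the value before the increment */
    uint32_t ticket = atomic_fetch_add(&hand->next_victim, 1);

    return ticket % hand->nbuffers;
}
```

Because every caller obtains a distinct ticket from the atomic increment, concurrent callers look at different buffers without taking a lock.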

II. Source Code Walkthrough

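StrategyGetBuffer is called by bufmgr from BufferAlloc() to obtain the next candidate buffer.
Its main logic is as follows:
1. Set up the local variables.
2. If a strategy object was passed in, try to take a buffer from its ring; return it on success.
3. If requested, wake the bgwriter: read bgwprocno from shared memory exactly once, then set the latch based on that value.
4. Count the buffer allocation request so that the bgwriter can estimate the rate of buffer consumption.
5. Check whether the freelist contains any buffers.
5.1 If it does, pop buffers one at a time and return the first one whose refcount and usage_count are both zero.
5.2 If it is empty, or none of the popped buffers was usable:
5.2.1 Run the clock sweep algorithm: skip pinned buffers, decrement nonzero usage counts, and return the first unpinned buffer whose usage_count is zero.
5.2.2 If NBuffers buffers in a row turn out to be pinned (trycounter reaches zero), raise the error "no unpinned buffers available".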
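Steps 5.2.1 and 5.2.2 can be sketched as a small single-threaded simulation. This is an illustration under stated assumptions, not the PostgreSQL implementation: `DemoBuf` and `demo_clock_sweep` are hypothetical, there is no locking, and a return value of -1 stands in for the error that the real function raises.

```c
#include <stdint.h>

/* Hypothetical per-buffer state, reduced to the two counters that matter. */
typedef struct DemoBuf
{
    uint32_t refcount;     /* number of pins */
    uint32_t usage_count;  /* recent-use counter */
} DemoBuf;

/*
 * Returns the index of the victim buffer, or -1 when nbuffers buffers in a
 * row were pinned.  *hand is the clock hand and is advanced on every probe.
 */
static int
demo_clock_sweep(DemoBuf *bufs, int nbuffers, int *hand)
{
    int trycounter = nbuffers;

    for (;;)
    {
        int      idx = *hand;
        DemoBuf *buf = &bufs[idx];

        *hand = (*hand + 1) % nbuffers;

        if (buf->refcount == 0)
        {
            if (buf->usage_count != 0)
            {
                /* Recently used: age it and keep scanning. */
                buf->usage_count--;
                trycounter = nbuffers;
            }
            else
                return idx;     /* unpinned and not recently used */
        }
        else if (--trycounter == 0)
            return -1;          /* everything we looked at was pinned */
    }
}
```

Resetting trycounter after each decrement is what guarantees progress: as long as the sweep keeps aging buffers, some usage count will eventually reach zero, so only an unbroken run of pinned buffers counts as failure.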

/*
 * StrategyGetBuffer
 *
 *  Called by the bufmgr to get the next candidate buffer to use in
 *  BufferAlloc(). The only hard requirement BufferAlloc() has is that
 *  the selected buffer must not currently be pinned by anyone.
 *
 *  strategy is a BufferAccessStrategy object, or NULL for default strategy.
 *
 *  To ensure that no one else can pin the buffer before we do, we must
 *  return the buffer with the buffer header spinlock still held.
 */
BufferDesc *
StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
{
    BufferDesc *buf;            /* buffer descriptor */
    int         bgwprocno;
    int         trycounter;     /* remaining attempts */
    uint32      local_buf_state;    /* to avoid repeated (de-)referencing */

    /*
     * If given a strategy object, see whether it can select a buffer. We
     * assume strategy objects don't need buffer_strategy_lock.
     */
    if (strategy != NULL)
    {
        buf = GetBufferFromRing(strategy, buf_state);
        if (buf != NULL)
            return buf;
    }

    /*
     * If asked, we need to waken the bgwriter. Since we don't want to rely on
     * a spinlock for this we force a read from shared memory once, and then
     * set the latch based on that value. We need to go through that length
     * because otherwise bgprocno might be reset while/after we check because
     * the compiler might just reread from memory.
     *
     * This can possibly set the latch of the wrong process if the bgwriter
     * dies in the wrong moment. But since PGPROC->procLatch is never
     * deallocated the worst consequence of that is that we set the latch of
     * some arbitrary process.
     */
    bgwprocno = INT_ACCESS_ONCE(StrategyControl->bgwprocno);
    if (bgwprocno != -1)
    {
        /* reset bgwprocno first, before setting the latch */
        StrategyControl->bgwprocno = -1;

        /*
         * Not acquiring ProcArrayLock here which is slightly icky. It's
         * actually fine because procLatch isn't ever freed, so we just can
         * potentially set the wrong process' (or no process') latch.
         */
        SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);
    }

    /*
     * We count buffer allocation requests so that the bgwriter can estimate
     * the rate of buffer consumption.  Note that buffers recycled by a
     * strategy object are intentionally not counted here.
     */
    pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);

    /*
     * First check, without acquiring the lock, whether there's buffers in the
     * freelist. Since we otherwise don't require the spinlock in every
     * StrategyGetBuffer() invocation, it'd be sad to acquire it here -
     * uselessly in most cases. That obviously leaves a race where a buffer is
     * put on the freelist but we don't see the store yet - but that's pretty
     * harmless, it'll just get used during the next buffer acquisition.
     *
     * If there's buffers on the freelist, acquire the spinlock to pop one
     * buffer of the freelist. Then check whether that buffer is usable and
     * repeat if not.
     *
     * Note that the freeNext fields are considered to be protected by the
     * buffer_strategy_lock not the individual buffer spinlocks, so it's OK to
     * manipulate them without holding the spinlock.
     */
    if (StrategyControl->firstFreeBuffer >= 0)
    {
        while (true)
        {
            /* Acquire the spinlock to remove element from the freelist */
            SpinLockAcquire(&StrategyControl->buffer_strategy_lock);

            if (StrategyControl->firstFreeBuffer < 0)
            {
                /* the freelist is empty: leave the loop */
                SpinLockRelease(&StrategyControl->buffer_strategy_lock);
                break;
            }

            buf = GetBufferDescriptor(StrategyControl->firstFreeBuffer);
            Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);

            /* Unconditionally remove buffer from freelist */
            StrategyControl->firstFreeBuffer = buf->freeNext;
            buf->freeNext = FREENEXT_NOT_IN_LIST;

            /*
             * Release the lock so someone else can access the freelist while
             * we check out this buffer.
             */
            SpinLockRelease(&StrategyControl->buffer_strategy_lock);

            /*
             * If the buffer is pinned or has a nonzero usage_count, we cannot
             * use it; discard it and retry.  (This can only happen if VACUUM
             * put a valid buffer in the freelist and then someone else used
             * it before we got to it.  It's probably impossible altogether as
             * of 8.3, but we'd better check anyway.)
             */
            local_buf_state = LockBufHdr(buf);
            if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
                && BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
            {
                /* non-default strategy: remember the buffer in the ring */
                if (strategy != NULL)
                    AddBufferToRing(strategy, buf);
                /* set the output parameter; header lock is still held */
                *buf_state = local_buf_state;
                return buf;
            }
            /* not usable: unlock the buffer header and try the next one */
            UnlockBufHdr(buf, local_buf_state);
        }
    }

    /* Nothing on the freelist, so run the "clock sweep" algorithm */
    //int NBuffers = 1000;
    trycounter = NBuffers;
    for (;;)
    {
        buf = GetBufferDescriptor(ClockSweepTick());

        /*
         * If the buffer is pinned or has a nonzero usage_count, we cannot use
         * it; decrement the usage_count (unless pinned) and keep scanning.
         */
        local_buf_state = LockBufHdr(buf);

        if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0)
        {
            if (BUF_STATE_GET_USAGECOUNT(local_buf_state) != 0)
            {
                /* usage_count != 0: decrement it and reset the counter */
                local_buf_state -= BUF_USAGECOUNT_ONE;

                trycounter = NBuffers;
            }
            else
            {
                /* Found a usable buffer */
                if (strategy != NULL)
                    AddBufferToRing(strategy, buf);
                *buf_state = local_buf_state;
                return buf;
            }
        }
        else if (--trycounter == 0)
        {
            /*
             * We've scanned all the buffers without making any state changes,
             * so all the buffers are pinned (or were when we looked at them).
             * We could hope that someone will free one eventually, but it's
             * probably better to fail than to risk getting stuck in an
             * infinite loop.
             */
            UnlockBufHdr(buf, local_buf_state);
            elog(ERROR, "no unpinned buffers available");
        }
        UnlockBufHdr(buf, local_buf_state);
    }
}
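The freelist branch of the function pops the head of a singly linked list that is threaded through the freeNext fields by buffer index. A minimal sketch of that pop, with locking left out: `DemoFreeList` and `demo_freelist_pop` are hypothetical, while the two marker values match what the gdb trace in this article shows (freeNext becomes -2 once a buffer leaves the list).

```c
#define FREENEXT_END_OF_LIST  (-1)
#define FREENEXT_NOT_IN_LIST  (-2)

/* Hypothetical freelist: links are stored as indexes, not pointers. */
typedef struct DemoFreeList
{
    int  first_free;  /* head index, or negative when the list is empty */
    int *free_next;   /* free_next[i] is the index that follows i */
} DemoFreeList;

/*
 * Pop the head of the list and return its index, or -1 if the list is
 * empty.  The real code does this while holding buffer_strategy_lock.
 */
static int
demo_freelist_pop(DemoFreeList *list)
{
    int idx = list->first_free;

    if (idx < 0)
        return -1;

    /* Unconditionally unlink the head, as StrategyGetBuffer does. */
    list->first_free = list->free_next[idx];
    list->free_next[idx] = FREENEXT_NOT_IN_LIST;

    return idx;
}
```

Whether the popped buffer is actually usable is a separate check made afterwards under the buffer header lock, which is why the function loops and pops again when the check fails.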

III. Trace Analysis

Test script: query a table:

10:01:54 (xdb@[local]:5432)testdb=# select * from t1 limit 10;

Start gdb and set a breakpoint:

(gdb)
Continuing.

Breakpoint 1, StrategyGetBuffer (strategy=0x0, buf_state=0x7ffcc97fb4ec) at freelist.c:212
212     if (strategy != NULL)
(gdb)

Input parameters
strategy=NULL: the strategy object; NULL selects the default strategy.

(gdb) p *buf_state
$1 = 0

1. Set up the local variables.
2. If a strategy object was passed in, try to take a buffer from its ring; return it on success.
3. If requested, wake the bgwriter: read bgwprocno from shared memory exactly once, then set the latch based on that value.

(gdb) n
231     bgwprocno = INT_ACCESS_ONCE(StrategyControl->bgwprocno);
(gdb)
232     if (bgwprocno != -1)
(gdb)
235         StrategyControl->bgwprocno = -1;
(gdb) p bgwprocno
$2 = 112
(gdb) p StrategyControl
$3 = (BufferStrategyControl *) 0x7f8607b21700
(gdb) p *StrategyControl
$4 = {buffer_strategy_lock = 0 '\000', nextVictimBuffer = {value = 0}, firstFreeBuffer = 134, lastFreeBuffer = 65535,
  completePasses = 0, numBufferAllocs = {value = 0}, bgwprocno = 112}
(gdb) n
242         SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);
(gdb)

4. Count the buffer allocation request so that the bgwriter can estimate the rate of buffer consumption.

(gdb)
250     pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);

5. Check whether the freelist contains any buffers.

(gdb)
268     if (StrategyControl->firstFreeBuffer >= 0)

5.1 The freelist is not empty, so pop a buffer and check it; it is usable, so return buf.

(gdb) n
273             SpinLockAcquire(&StrategyControl->buffer_strategy_lock);
(gdb)
275             if (StrategyControl->firstFreeBuffer < 0)
(gdb)
281             buf = GetBufferDescriptor(StrategyControl->firstFreeBuffer);
(gdb)
282             Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);
(gdb) p *buf
$5 = {tag = {rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forkNum = InvalidForkNumber, blockNum = 4294967295},
  buf_id = 134, state = {value = 0}, wait_backend_pid = 0, freeNext = 135, content_lock = {tranche = 54, state = {
      value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb) n
285             StrategyControl->firstFreeBuffer = buf->freeNext;
(gdb)
286             buf->freeNext = FREENEXT_NOT_IN_LIST;
(gdb)
292             SpinLockRelease(&StrategyControl->buffer_strategy_lock);
(gdb)
301             local_buf_state = LockBufHdr(buf);
(gdb)
302             if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
(gdb)
303                 && BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
(gdb)
305                 if (strategy != NULL)
(gdb)
307                 *buf_state = local_buf_state;
(gdb)
308                 return buf;
(gdb) p *buf_state
$6 = 4194304
(gdb) p *buf
$7 = {tag = {rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forkNum = InvalidForkNumber, blockNum = 4294967295},
  buf_id = 134, state = {value = 4194304}, wait_backend_pid = 0, freeNext = -2, content_lock = {tranche = 54, state = {
      value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb)
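The value 4194304 printed for *buf_state equals 1 << 22, so BM_LOCKED is the only bit set. That matches the function's contract of returning the buffer with the header spinlock still held. The small decoder below makes the reading explicit. It assumes the layout of 18 refcount bits, 4 usage-count bits and 10 flag bits used by the PostgreSQL version traced here, and the `demo_*` helpers are hypothetical, not PostgreSQL macros.

```c
#include <stdint.h>

/* Assumed layout: bits 0-17 refcount, bits 18-21 usage count, rest flags. */
#define BUF_REFCOUNT_MASK     ((1U << 18) - 1)
#define BUF_USAGECOUNT_MASK   0x003C0000U
#define BUF_USAGECOUNT_SHIFT  18
#define BM_LOCKED             (1U << 22)

/* Number of pins recorded in the state word. */
static uint32_t
demo_refcount(uint32_t state)
{
    return state & BUF_REFCOUNT_MASK;
}

/* Usage count recorded in the state word. */
static uint32_t
demo_usagecount(uint32_t state)
{
    return (state & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT;
}

/* Nonzero when the buffer header lock bit is set. */
static int
demo_is_locked(uint32_t state)
{
    return (state & BM_LOCKED) != 0;
}
```

Applied to 4194304, these helpers report a refcount of 0, a usage count of 0 and a held header lock, which is exactly the condition tested at lines 302 and 303 of the trace.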

The function returns, and control goes back to BufferAlloc:

(gdb) n
358 }
(gdb)
BufferAlloc (smgr=0x22a38a0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, strategy=0x0,
    foundPtr=0x7ffcc97fb5c3) at bufmgr.c:1073
1073            Assert(BUF_STATE_GET_REFCOUNT(buf_state) == 0);
(gdb)

That concludes this look at what the StrategyGetBuffer function does in PostgreSQL.
