PostgreSQL Source Code git master
vacuumlazy.c
Go to the documentation of this file.
1/*-------------------------------------------------------------------------
2 *
3 * vacuumlazy.c
4 * Concurrent ("lazy") vacuuming.
5 *
6 * Heap relations are vacuumed in three main phases. In phase I, vacuum scans
7 * relation pages, pruning and freezing tuples and saving dead tuples' TIDs in
8 * a TID store. If that TID store fills up or vacuum finishes scanning the
9 * relation, it progresses to phase II: index vacuuming. Index vacuuming
10 * deletes the dead index entries referenced in the TID store. In phase III,
11 * vacuum scans the blocks of the relation referred to by the TIDs in the TID
12 * store and reaps the corresponding dead items, freeing that space for future
13 * tuples.
14 *
15 * If there are no indexes or index scanning is disabled, phase II may be
16 * skipped. If phase I identified very few dead index entries or if vacuum's
17 * failsafe mechanism has triggered (to avoid transaction ID wraparound),
18 * vacuum may skip phases II and III.
19 *
20 * If the TID store fills up in phase I, vacuum suspends phase I and proceeds
21 * to phases II and III, cleaning up the dead tuples referenced in the current
22 * TID store. This empties the TID store, allowing vacuum to resume phase I.
23 *
24 * In a way, the phases are more like states in a state machine, but they have
25 * been referred to colloquially as phases for so long that they are referred
26 * to as such here.
27 *
28 * Manually invoked VACUUMs may scan indexes during phase II in parallel. For
29 * more information on this, see the comment at the top of vacuumparallel.c.
30 *
31 * In between phases, vacuum updates the freespace map (every
32 * VACUUM_FSM_EVERY_PAGES).
33 *
34 * After completing all three phases, vacuum may truncate the relation if it
35 * has emptied pages at the end. Finally, vacuum updates relation statistics
36 * in pg_class and the cumulative statistics subsystem.
37 *
38 * Relation Scanning:
39 *
40 * Vacuum scans the heap relation, starting at the beginning and progressing
41 * to the end, skipping pages as permitted by their visibility status, vacuum
42 * options, and various other requirements.
43 *
44 * Vacuums are either aggressive or normal. Aggressive vacuums must scan every
45 * unfrozen tuple in order to advance relfrozenxid and avoid transaction ID
46 * wraparound. Normal vacuums may scan otherwise skippable pages for one of
47 * two reasons:
48 *
49 * When page skipping is not disabled, a normal vacuum may scan pages that are
50 * marked all-visible (and even all-frozen) in the visibility map if the range
51 * of skippable pages is below SKIP_PAGES_THRESHOLD. This is primarily for the
52 * benefit of kernel readahead (see comment in heap_vac_scan_next_block()).
53 *
54 * A normal vacuum may also scan skippable pages in an effort to freeze them
55 * and decrease the backlog of all-visible but not all-frozen pages that have
56 * to be processed by the next aggressive vacuum. These are referred to as
57 * eagerly scanned pages. Pages scanned due to SKIP_PAGES_THRESHOLD do not
58 * count as eagerly scanned pages.
59 *
60 * Eagerly scanned pages that are set all-frozen in the VM are successful
61 * eager freezes and those not set all-frozen in the VM are failed eager
62 * freezes.
63 *
64 * Because we want to amortize the overhead of freezing pages over multiple
65 * vacuums, normal vacuums cap the number of successful eager freezes to
66 * MAX_EAGER_FREEZE_SUCCESS_RATE of the number of all-visible but not
67 * all-frozen pages at the beginning of the vacuum. Since eagerly frozen pages
68 * may be unfrozen before the next aggressive vacuum, capping the number of
69 * successful eager freezes also caps the downside of eager freezing:
70 * potentially wasted work.
71 *
72 * Once the success cap has been hit, eager scanning is disabled for the
73 * remainder of the vacuum of the relation.
74 *
75 * Success is capped globally because we don't want to limit our successes if
76 * old data happens to be concentrated in a particular part of the table. This
77 * is especially likely to happen for append-mostly workloads where the oldest
78 * data is at the beginning of the unfrozen portion of the relation.
79 *
80 * On the assumption that different regions of the table are likely to contain
81 * similarly aged data, normal vacuums use a localized eager freeze failure
82 * cap. The failure count is reset for each region of the table -- comprised
83 * of EAGER_SCAN_REGION_SIZE blocks. In each region, we tolerate
84 * vacuum_max_eager_freeze_failure_rate of EAGER_SCAN_REGION_SIZE failures
85 * before suspending eager scanning until the end of the region.
86 * vacuum_max_eager_freeze_failure_rate is configurable both globally and per
87 * table.
88 *
89 * Aggressive vacuums must examine every unfrozen tuple and thus are not
90 * subject to any of the limits imposed by the eager scanning algorithm.
91 *
92 * Once vacuum has decided to scan a given block, it must read the block and
93 * obtain a cleanup lock to prune tuples on the page. A non-aggressive vacuum
94 * may choose to skip pruning and freezing if it cannot acquire a cleanup lock
95 * on the buffer right away. In this case, it may miss cleaning up dead tuples
96 * and their associated index entries (though it is free to reap any existing
97 * dead items on the page).
98 *
99 * After pruning and freezing, pages that are newly all-visible and all-frozen
100 * are marked as such in the visibility map.
101 *
102 * Dead TID Storage:
103 *
104 * The major space usage for vacuuming is storage for the dead tuple IDs that
105 * are to be removed from indexes. We want to ensure we can vacuum even the
106 * very largest relations with finite memory space usage. To do that, we set
107 * upper bounds on the memory that can be used for keeping track of dead TIDs
108 * at once.
109 *
110 * We are willing to use at most maintenance_work_mem (or perhaps
111 * autovacuum_work_mem) memory space to keep track of dead TIDs. If the
112 * TID store is full, we must call lazy_vacuum to vacuum indexes (and to vacuum
113 * the pages that we've pruned). This frees up the memory space dedicated to
114 * store dead TIDs.
115 *
116 * In practice VACUUM will often complete its initial pass over the target
117 * heap relation without ever running out of space to store TIDs. This means
118 * that there only needs to be one call to lazy_vacuum, after the initial pass
119 * completes.
120 *
121 * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
122 * Portions Copyright (c) 1994, Regents of the University of California
123 *
124 *
125 * IDENTIFICATION
126 * src/backend/access/heap/vacuumlazy.c
127 *
128 *-------------------------------------------------------------------------
129 */
130#include "postgres.h"
131
132#include <math.h>
133
134#include "access/genam.h"
135#include "access/heapam.h"
136#include "access/htup_details.h"
137#include "access/multixact.h"
138#include "access/tidstore.h"
139#include "access/transam.h"
140#include "access/visibilitymap.h"
141#include "access/xloginsert.h"
142#include "catalog/storage.h"
143#include "commands/dbcommands.h"
144#include "commands/progress.h"
145#include "commands/vacuum.h"
146#include "common/int.h"
147#include "common/pg_prng.h"
148#include "executor/instrument.h"
149#include "miscadmin.h"
150#include "pgstat.h"
153#include "storage/bufmgr.h"
154#include "storage/freespace.h"
155#include "storage/lmgr.h"
156#include "storage/read_stream.h"
157#include "utils/lsyscache.h"
158#include "utils/pg_rusage.h"
159#include "utils/timestamp.h"
160
161
162/*
163 * Space/time tradeoff parameters: do these need to be user-tunable?
164 *
165 * To consider truncating the relation, we want there to be at least
166 * REL_TRUNCATE_MINIMUM or (relsize / REL_TRUNCATE_FRACTION) (whichever
167 * is less) potentially-freeable pages.
168 */
169#define REL_TRUNCATE_MINIMUM 1000
170#define REL_TRUNCATE_FRACTION 16
171
172/*
173 * Timing parameters for truncate locking heuristics.
174 *
175 * These were not exposed as user tunable GUC values because it didn't seem
176 * that the potential for improvement was great enough to merit the cost of
177 * supporting them.
178 */
179#define VACUUM_TRUNCATE_LOCK_CHECK_INTERVAL 20 /* ms */
180#define VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL 50 /* ms */
181#define VACUUM_TRUNCATE_LOCK_TIMEOUT 5000 /* ms */
182
183/*
184 * Threshold that controls whether we bypass index vacuuming and heap
185 * vacuuming as an optimization
186 */
187#define BYPASS_THRESHOLD_PAGES 0.02 /* i.e. 2% of rel_pages */
188
189/*
190 * Perform a failsafe check each time we scan another 4GB of pages.
191 * (Note that this is deliberately kept to a power-of-two, usually 2^19.)
192 */
193#define FAILSAFE_EVERY_PAGES \
194 ((BlockNumber) (((uint64) 4 * 1024 * 1024 * 1024) / BLCKSZ))
195
196/*
197 * When a table has no indexes, vacuum the FSM after every 8GB, approximately
198 * (it won't be exact because we only vacuum FSM after processing a heap page
199 * that has some removable tuples). When there are indexes, this is ignored,
200 * and we vacuum FSM after each index/heap cleaning pass.
201 */
202#define VACUUM_FSM_EVERY_PAGES \
203 ((BlockNumber) (((uint64) 8 * 1024 * 1024 * 1024) / BLCKSZ))
204
205/*
206 * Before we consider skipping a page that's marked as clean in
207 * visibility map, we must've seen at least this many clean pages.
208 */
209#define SKIP_PAGES_THRESHOLD ((BlockNumber) 32)
210
211/*
212 * Size of the prefetch window for lazy vacuum backwards truncation scan.
213 * Needs to be a power of 2.
214 */
215#define PREFETCH_SIZE ((BlockNumber) 32)
216
217/*
218 * Macro to check if we are in a parallel vacuum. If true, we are in the
219 * parallel mode and the DSM segment is initialized.
220 */
221#define ParallelVacuumIsActive(vacrel) ((vacrel)->pvs != NULL)
222
223/* Phases of vacuum during which we report error context. */
224typedef enum
225{
233
234/*
235 * An eager scan of a page that is set all-frozen in the VM is considered
236 * "successful". To spread out freezing overhead across multiple normal
237 * vacuums, we limit the number of successful eager page freezes. The maximum
238 * number of eager page freezes is calculated as a ratio of the all-visible
239 * but not all-frozen pages at the beginning of the vacuum.
240 */
241#define MAX_EAGER_FREEZE_SUCCESS_RATE 0.2
242
243/*
244 * On the assumption that different regions of the table tend to have
245 * similarly aged data, once vacuum fails to freeze
246 * vacuum_max_eager_freeze_failure_rate of the blocks in a region of size
247 * EAGER_SCAN_REGION_SIZE, it suspends eager scanning until it has progressed
248 * to another region of the table with potentially older data.
249 */
250#define EAGER_SCAN_REGION_SIZE 4096
251
252/*
253 * heap_vac_scan_next_block() sets these flags to communicate information
254 * about the block it read to the caller.
255 */
256#define VAC_BLK_WAS_EAGER_SCANNED (1 << 0)
257#define VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM (1 << 1)
258
259typedef struct LVRelState
260{
261 /* Target heap relation and its indexes */
265
266 /* Buffer access strategy and parallel vacuum state */
269
270 /* Aggressive VACUUM? (must set relfrozenxid >= FreezeLimit) */
272 /* Use visibility map to skip? (disabled by DISABLE_PAGE_SKIPPING) */
274 /* Consider index vacuuming bypass optimization? */
276
277 /* Doing index vacuuming, index cleanup, rel truncation? */
281
282 /* VACUUM operation's cutoffs for freezing and pruning */
285 /* Tracks oldest extant XID/MXID for setting relfrozenxid/relminmxid */
289
290 /* Error reporting state */
291 char *dbname;
293 char *relname;
294 char *indname; /* Current index name */
295 BlockNumber blkno; /* used only for heap operations */
296 OffsetNumber offnum; /* used only for heap operations */
298 bool verbose; /* VACUUM VERBOSE? */
299
300 /*
301 * dead_items stores TIDs whose index tuples are deleted by index
302 * vacuuming. Each TID points to an LP_DEAD line pointer from a heap page
303 * that has been processed by lazy_scan_prune. Also needed by
304 * lazy_vacuum_heap_rel, which marks the same LP_DEAD line pointers as
305 * LP_UNUSED during second heap pass.
306 *
307 * Both dead_items and dead_items_info are allocated in shared memory in
308 * parallel vacuum cases.
309 */
310 TidStore *dead_items; /* TIDs whose index tuples we'll delete */
312
313 BlockNumber rel_pages; /* total number of pages */
314 BlockNumber scanned_pages; /* # pages examined (not skipped via VM) */
315
316 /*
317 * Count of all-visible blocks eagerly scanned (for logging only). This
318 * does not include skippable blocks scanned due to SKIP_PAGES_THRESHOLD.
319 */
321
322 BlockNumber removed_pages; /* # pages removed by relation truncation */
323 BlockNumber new_frozen_tuple_pages; /* # pages with newly frozen tuples */
324
325 /* # pages newly set all-visible in the VM */
327
328 /*
329 * # pages newly set all-visible and all-frozen in the VM. This is a
330 * subset of vm_new_visible_pages. That is, vm_new_visible_pages includes
331 * all pages set all-visible, but vm_new_visible_frozen_pages includes
332 * only those which were also set all-frozen.
333 */
335
336 /* # all-visible pages newly set all-frozen in the VM */
338
339 BlockNumber lpdead_item_pages; /* # pages with LP_DEAD items */
340 BlockNumber missed_dead_pages; /* # pages with missed dead tuples */
341 BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
342
343 /* Statistics output by us, for table */
344 double new_rel_tuples; /* new estimated total # of tuples */
345 double new_live_tuples; /* new estimated total # of live tuples */
346 /* Statistics output by index AMs */
348
349 /* Instrumentation counters */
351 /* Counters that follow are only for scanned_pages */
352 int64 tuples_deleted; /* # deleted from table */
353 int64 tuples_frozen; /* # newly frozen */
354 int64 lpdead_items; /* # deleted from indexes */
355 int64 live_tuples; /* # live tuples remaining */
356 int64 recently_dead_tuples; /* # dead, but not yet removable */
357 int64 missed_dead_tuples; /* # removable, but not removed */
358
359 /* State maintained by heap_vac_scan_next_block() */
360 BlockNumber current_block; /* last block returned */
361 BlockNumber next_unskippable_block; /* next unskippable block */
362 bool next_unskippable_allvis; /* its visibility status */
363 bool next_unskippable_eager_scanned; /* if it was eagerly scanned */
364 Buffer next_unskippable_vmbuffer; /* buffer containing its VM bit */
365
366 /* State related to managing eager scanning of all-visible pages */
367
368 /*
369 * A normal vacuum that has failed to freeze too many eagerly scanned
370 * blocks in a region suspends eager scanning.
371 * next_eager_scan_region_start is the block number of the first block
372 * eligible for resumed eager scanning.
373 *
374 * When eager scanning is permanently disabled, either initially
375 * (including for aggressive vacuum) or due to hitting the success cap,
376 * this is set to InvalidBlockNumber.
377 */
379
380 /*
381 * The remaining number of blocks a normal vacuum will consider eager
382 * scanning when it is successful. When eager scanning is enabled, this is
383 * initialized to MAX_EAGER_FREEZE_SUCCESS_RATE of the total number of
384 * all-visible but not all-frozen pages. For each eager freeze success,
385 * this is decremented. Once it hits 0, eager scanning is permanently
386 * disabled. It is initialized to 0 if eager scanning starts out disabled
387 * (including for aggressive vacuum).
388 */
390
391 /*
392 * The maximum number of blocks which may be eagerly scanned and not
393 * frozen before eager scanning is temporarily suspended. This is
394 * configurable both globally, via the
395 * vacuum_max_eager_freeze_failure_rate GUC, and per table, with a table
396 * storage parameter of the same name. It is calculated as
397 * vacuum_max_eager_freeze_failure_rate of EAGER_SCAN_REGION_SIZE blocks.
398 * It is 0 when eager scanning is disabled.
399 */
401
402 /*
403 * The number of eagerly scanned blocks vacuum failed to freeze (due to
404 * age) in the current eager scan region. Vacuum resets it to
405 * eager_scan_max_fails_per_region each time it enters a new region of the
406 * relation. If eager_scan_remaining_fails hits 0, eager scanning is
407 * suspended until the next region. It is also 0 if eager scanning has
408 * been permanently disabled.
409 */
412
413
414/* Struct for saving and restoring vacuum error information. */
415typedef struct LVSavedErrInfo
416{
421
422
423/* non-export function prototypes */
424static void lazy_scan_heap(LVRelState *vacrel);
425static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
426 VacuumParams *params);
428 void *callback_private_data,
429 void *per_buffer_data);
430static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
431static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
432 BlockNumber blkno, Page page,
433 bool sharelock, Buffer vmbuffer);
434static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
435 BlockNumber blkno, Page page,
436 Buffer vmbuffer, bool all_visible_according_to_vm,
437 bool *has_lpdead_items, bool *vm_page_frozen);
438static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
439 BlockNumber blkno, Page page,
440 bool *has_lpdead_items);
441static void lazy_vacuum(LVRelState *vacrel);
442static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
443static void lazy_vacuum_heap_rel(LVRelState *vacrel);
444static void lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
445 Buffer buffer, OffsetNumber *deadoffsets,
446 int num_offsets, Buffer vmbuffer);
447static bool lazy_check_wraparound_failsafe(LVRelState *vacrel);
448static void lazy_cleanup_all_indexes(LVRelState *vacrel);
451 double reltuples,
452 LVRelState *vacrel);
455 double reltuples,
456 bool estimated_count,
457 LVRelState *vacrel);
458static bool should_attempt_truncation(LVRelState *vacrel);
459static void lazy_truncate_heap(LVRelState *vacrel);
461 bool *lock_waiter_detected);
462static void dead_items_alloc(LVRelState *vacrel, int nworkers);
463static void dead_items_add(LVRelState *vacrel, BlockNumber blkno, OffsetNumber *offsets,
464 int num_offsets);
465static void dead_items_reset(LVRelState *vacrel);
466static void dead_items_cleanup(LVRelState *vacrel);
467static bool heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
468 TransactionId *visibility_cutoff_xid, bool *all_frozen);
469static void update_relstats_all_indexes(LVRelState *vacrel);
470static void vacuum_error_callback(void *arg);
471static void update_vacuum_error_info(LVRelState *vacrel,
472 LVSavedErrInfo *saved_vacrel,
473 int phase, BlockNumber blkno,
474 OffsetNumber offnum);
475static void restore_vacuum_error_info(LVRelState *vacrel,
476 const LVSavedErrInfo *saved_vacrel);
477
478
479
480/*
481 * Helper to set up the eager scanning state for vacuuming a single relation.
482 * Initializes the eager scan management related members of the LVRelState.
483 *
484 * Caller provides whether or not an aggressive vacuum is required due to
485 * vacuum options or for relfrozenxid/relminmxid advancement.
486 */
487static void
489{
490 uint32 randseed;
491 BlockNumber allvisible;
492 BlockNumber allfrozen;
493 float first_region_ratio;
494 bool oldest_unfrozen_before_cutoff = false;
495
496 /*
497 * Initialize eager scan management fields to their disabled values.
498 * Aggressive vacuums, normal vacuums of small tables, and normal vacuums
499 * of tables without sufficiently old tuples disable eager scanning.
500 */
503 vacrel->eager_scan_remaining_fails = 0;
505
506 /* If eager scanning is explicitly disabled, just return. */
507 if (params->max_eager_freeze_failure_rate == 0)
508 return;
509
510 /*
511 * The caller will have determined whether or not an aggressive vacuum is
512 * required by either the vacuum parameters or the relative age of the
513 * oldest unfrozen transaction IDs. An aggressive vacuum must scan every
514 * all-visible page to safely advance the relfrozenxid and/or relminmxid,
515 * so scans of all-visible pages are not considered eager.
516 */
517 if (vacrel->aggressive)
518 return;
519
520 /*
521 * Aggressively vacuuming a small relation shouldn't take long, so it
522 * isn't worth amortizing. We use two times the region size as the size
523 * cutoff because the eager scan start block is a random spot somewhere in
524 * the first region, making the second region the first to be eager
525 * scanned normally.
526 */
527 if (vacrel->rel_pages < 2 * EAGER_SCAN_REGION_SIZE)
528 return;
529
530 /*
531 * We only want to enable eager scanning if we are likely to be able to
532 * freeze some of the pages in the relation.
533 *
534 * Tuples with XIDs older than OldestXmin or MXIDs older than OldestMxact
535 * are technically freezable, but we won't freeze them unless the criteria
536 * for opportunistic freezing is met. Only tuples with XIDs/MXIDs older
537 * than the the FreezeLimit/MultiXactCutoff are frozen in the common case.
538 *
539 * So, as a heuristic, we wait until the FreezeLimit has advanced past the
540 * relfrozenxid or the MultiXactCutoff has advanced past the relminmxid to
541 * enable eager scanning.
542 */
545 vacrel->cutoffs.FreezeLimit))
546 oldest_unfrozen_before_cutoff = true;
547
548 if (!oldest_unfrozen_before_cutoff &&
551 vacrel->cutoffs.MultiXactCutoff))
552 oldest_unfrozen_before_cutoff = true;
553
554 if (!oldest_unfrozen_before_cutoff)
555 return;
556
557 /* We have met the criteria to eagerly scan some pages. */
558
559 /*
560 * Our success cap is MAX_EAGER_FREEZE_SUCCESS_RATE of the number of
561 * all-visible but not all-frozen blocks in the relation.
562 */
563 visibilitymap_count(vacrel->rel, &allvisible, &allfrozen);
564
567 (allvisible - allfrozen));
568
569 /* If every all-visible page is frozen, eager scanning is disabled. */
570 if (vacrel->eager_scan_remaining_successes == 0)
571 return;
572
573 /*
574 * Now calculate the bounds of the first eager scan region. Its end block
575 * will be a random spot somewhere in the first EAGER_SCAN_REGION_SIZE
576 * blocks. This affects the bounds of all subsequent regions and avoids
577 * eager scanning and failing to freeze the same blocks each vacuum of the
578 * relation.
579 */
581
583
585 params->max_eager_freeze_failure_rate <= 1);
586
590
591 /*
592 * The first region will be smaller than subsequent regions. As such,
593 * adjust the eager freeze failures tolerated for this region.
594 */
595 first_region_ratio = 1 - (float) vacrel->next_eager_scan_region_start /
597
600 first_region_ratio;
601}
602
603/*
604 * heap_vacuum_rel() -- perform VACUUM for one heap relation
605 *
606 * This routine sets things up for and then calls lazy_scan_heap, where
607 * almost all work actually takes place. Finalizes everything after call
608 * returns by managing relation truncation and updating rel's pg_class
609 * entry. (Also updates pg_class entries for any indexes that need it.)
610 *
611 * At entry, we have already established a transaction and opened
612 * and locked the relation.
613 */
614void
616 BufferAccessStrategy bstrategy)
617{
618 LVRelState *vacrel;
619 bool verbose,
620 instrument,
621 skipwithvm,
622 frozenxid_updated,
623 minmulti_updated;
624 BlockNumber orig_rel_pages,
625 new_rel_pages,
626 new_rel_allvisible;
627 PGRUsage ru0;
628 TimestampTz starttime = 0;
629 PgStat_Counter startreadtime = 0,
630 startwritetime = 0;
631 WalUsage startwalusage = pgWalUsage;
632 BufferUsage startbufferusage = pgBufferUsage;
633 ErrorContextCallback errcallback;
634 char **indnames = NULL;
635
636 verbose = (params->options & VACOPT_VERBOSE) != 0;
637 instrument = (verbose || (AmAutoVacuumWorkerProcess() &&
638 params->log_min_duration >= 0));
639 if (instrument)
640 {
641 pg_rusage_init(&ru0);
642 if (track_io_timing)
643 {
644 startreadtime = pgStatBlockReadTime;
645 startwritetime = pgStatBlockWriteTime;
646 }
647 }
648
649 /* Used for instrumentation and stats report */
650 starttime = GetCurrentTimestamp();
651
653 RelationGetRelid(rel));
654
655 /*
656 * Setup error traceback support for ereport() first. The idea is to set
657 * up an error context callback to display additional information on any
658 * error during a vacuum. During different phases of vacuum, we update
659 * the state so that the error context callback always display current
660 * information.
661 *
662 * Copy the names of heap rel into local memory for error reporting
663 * purposes, too. It isn't always safe to assume that we can get the name
664 * of each rel. It's convenient for code in lazy_scan_heap to always use
665 * these temp copies.
666 */
667 vacrel = (LVRelState *) palloc0(sizeof(LVRelState));
671 vacrel->indname = NULL;
673 vacrel->verbose = verbose;
674 errcallback.callback = vacuum_error_callback;
675 errcallback.arg = vacrel;
676 errcallback.previous = error_context_stack;
677 error_context_stack = &errcallback;
678
679 /* Set up high level stuff about rel and its indexes */
680 vacrel->rel = rel;
682 &vacrel->indrels);
683 vacrel->bstrategy = bstrategy;
684 if (instrument && vacrel->nindexes > 0)
685 {
686 /* Copy index names used by instrumentation (not error reporting) */
687 indnames = palloc(sizeof(char *) * vacrel->nindexes);
688 for (int i = 0; i < vacrel->nindexes; i++)
689 indnames[i] = pstrdup(RelationGetRelationName(vacrel->indrels[i]));
690 }
691
692 /*
693 * The index_cleanup param either disables index vacuuming and cleanup or
694 * forces it to go ahead when we would otherwise apply the index bypass
695 * optimization. The default is 'auto', which leaves the final decision
696 * up to lazy_vacuum().
697 *
698 * The truncate param allows user to avoid attempting relation truncation,
699 * though it can't force truncation to happen.
700 */
703 params->truncate != VACOPTVALUE_AUTO);
704
705 /*
706 * While VacuumFailSafeActive is reset to false before calling this, we
707 * still need to reset it here due to recursive calls.
708 */
709 VacuumFailsafeActive = false;
710 vacrel->consider_bypass_optimization = true;
711 vacrel->do_index_vacuuming = true;
712 vacrel->do_index_cleanup = true;
713 vacrel->do_rel_truncate = (params->truncate != VACOPTVALUE_DISABLED);
714 if (params->index_cleanup == VACOPTVALUE_DISABLED)
715 {
716 /* Force disable index vacuuming up-front */
717 vacrel->do_index_vacuuming = false;
718 vacrel->do_index_cleanup = false;
719 }
720 else if (params->index_cleanup == VACOPTVALUE_ENABLED)
721 {
722 /* Force index vacuuming. Note that failsafe can still bypass. */
723 vacrel->consider_bypass_optimization = false;
724 }
725 else
726 {
727 /* Default/auto, make all decisions dynamically */
729 }
730
731 /* Initialize page counters explicitly (be tidy) */
732 vacrel->scanned_pages = 0;
733 vacrel->eager_scanned_pages = 0;
734 vacrel->removed_pages = 0;
735 vacrel->new_frozen_tuple_pages = 0;
736 vacrel->lpdead_item_pages = 0;
737 vacrel->missed_dead_pages = 0;
738 vacrel->nonempty_pages = 0;
739 /* dead_items_alloc allocates vacrel->dead_items later on */
740
741 /* Allocate/initialize output statistics state */
742 vacrel->new_rel_tuples = 0;
743 vacrel->new_live_tuples = 0;
744 vacrel->indstats = (IndexBulkDeleteResult **)
745 palloc0(vacrel->nindexes * sizeof(IndexBulkDeleteResult *));
746
747 /* Initialize remaining counters (be tidy) */
748 vacrel->num_index_scans = 0;
749 vacrel->tuples_deleted = 0;
750 vacrel->tuples_frozen = 0;
751 vacrel->lpdead_items = 0;
752 vacrel->live_tuples = 0;
753 vacrel->recently_dead_tuples = 0;
754 vacrel->missed_dead_tuples = 0;
755
756 vacrel->vm_new_visible_pages = 0;
757 vacrel->vm_new_visible_frozen_pages = 0;
758 vacrel->vm_new_frozen_pages = 0;
759 vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel);
760
761 /*
762 * Get cutoffs that determine which deleted tuples are considered DEAD,
763 * not just RECENTLY_DEAD, and which XIDs/MXIDs to freeze. Then determine
764 * the extent of the blocks that we'll scan in lazy_scan_heap. It has to
765 * happen in this order to ensure that the OldestXmin cutoff field works
766 * as an upper bound on the XIDs stored in the pages we'll actually scan
767 * (NewRelfrozenXid tracking must never be allowed to miss unfrozen XIDs).
768 *
769 * Next acquire vistest, a related cutoff that's used in pruning. We use
770 * vistest in combination with OldestXmin to ensure that
771 * heap_page_prune_and_freeze() always removes any deleted tuple whose
772 * xmax is < OldestXmin. lazy_scan_prune must never become confused about
773 * whether a tuple should be frozen or removed. (In the future we might
774 * want to teach lazy_scan_prune to recompute vistest from time to time,
775 * to increase the number of dead tuples it can prune away.)
776 */
777 vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs);
778 vacrel->vistest = GlobalVisTestFor(rel);
779 /* Initialize state used to track oldest extant XID/MXID */
780 vacrel->NewRelfrozenXid = vacrel->cutoffs.OldestXmin;
781 vacrel->NewRelminMxid = vacrel->cutoffs.OldestMxact;
782
783 /*
784 * Initialize state related to tracking all-visible page skipping. This is
785 * very important to determine whether or not it is safe to advance the
786 * relfrozenxid/relminmxid.
787 */
788 vacrel->skippedallvis = false;
789 skipwithvm = true;
791 {
792 /*
793 * Force aggressive mode, and disable skipping blocks using the
794 * visibility map (even those set all-frozen)
795 */
796 vacrel->aggressive = true;
797 skipwithvm = false;
798 }
799
800 vacrel->skipwithvm = skipwithvm;
801
802 /*
803 * Set up eager scan tracking state. This must happen after determining
804 * whether or not the vacuum must be aggressive, because only normal
805 * vacuums use the eager scan algorithm.
806 */
807 heap_vacuum_eager_scan_setup(vacrel, params);
808
809 if (verbose)
810 {
811 if (vacrel->aggressive)
813 (errmsg("aggressively vacuuming \"%s.%s.%s\"",
814 vacrel->dbname, vacrel->relnamespace,
815 vacrel->relname)));
816 else
818 (errmsg("vacuuming \"%s.%s.%s\"",
819 vacrel->dbname, vacrel->relnamespace,
820 vacrel->relname)));
821 }
822
823 /*
824 * Allocate dead_items memory using dead_items_alloc. This handles
825 * parallel VACUUM initialization as part of allocating shared memory
826 * space used for dead_items. (But do a failsafe precheck first, to
827 * ensure that parallel VACUUM won't be attempted at all when relfrozenxid
828 * is already dangerously old.)
829 */
831 dead_items_alloc(vacrel, params->nworkers);
832
833 /*
834 * Call lazy_scan_heap to perform all required heap pruning, index
835 * vacuuming, and heap vacuuming (plus related processing)
836 */
837 lazy_scan_heap(vacrel);
838
839 /*
840 * Free resources managed by dead_items_alloc. This ends parallel mode in
841 * passing when necessary.
842 */
843 dead_items_cleanup(vacrel);
845
846 /*
847 * Update pg_class entries for each of rel's indexes where appropriate.
848 *
849 * Unlike the later update to rel's pg_class entry, this is not critical.
850 * Maintains relpages/reltuples statistics used by the planner only.
851 */
852 if (vacrel->do_index_cleanup)
854
855 /* Done with rel's indexes */
856 vac_close_indexes(vacrel->nindexes, vacrel->indrels, NoLock);
857
858 /* Optionally truncate rel */
859 if (should_attempt_truncation(vacrel))
860 lazy_truncate_heap(vacrel);
861
862 /* Pop the error context stack */
863 error_context_stack = errcallback.previous;
864
865 /* Report that we are now doing final cleanup */
868
869 /*
870 * Prepare to update rel's pg_class entry.
871 *
872 * Aggressive VACUUMs must always be able to advance relfrozenxid to a
873 * value >= FreezeLimit, and relminmxid to a value >= MultiXactCutoff.
874 * Non-aggressive VACUUMs may advance them by any amount, or not at all.
875 */
876 Assert(vacrel->NewRelfrozenXid == vacrel->cutoffs.OldestXmin ||
878 vacrel->cutoffs.relfrozenxid,
879 vacrel->NewRelfrozenXid));
880 Assert(vacrel->NewRelminMxid == vacrel->cutoffs.OldestMxact ||
882 vacrel->cutoffs.relminmxid,
883 vacrel->NewRelminMxid));
884 if (vacrel->skippedallvis)
885 {
886 /*
887 * Must keep original relfrozenxid in a non-aggressive VACUUM that
888 * chose to skip an all-visible page range. The state that tracks new
889 * values will have missed unfrozen XIDs from the pages we skipped.
890 */
891 Assert(!vacrel->aggressive);
894 }
895
896 /*
897 * For safety, clamp relallvisible to be not more than what we're setting
898 * pg_class.relpages to
899 */
900 new_rel_pages = vacrel->rel_pages; /* After possible rel truncation */
901 visibilitymap_count(rel, &new_rel_allvisible, NULL);
902 if (new_rel_allvisible > new_rel_pages)
903 new_rel_allvisible = new_rel_pages;
904
905 /*
906 * Now actually update rel's pg_class entry.
907 *
908 * In principle new_live_tuples could be -1 indicating that we (still)
909 * don't know the tuple count. In practice that can't happen, since we
910 * scan every page that isn't skipped using the visibility map.
911 */
912 vac_update_relstats(rel, new_rel_pages, vacrel->new_live_tuples,
913 new_rel_allvisible, vacrel->nindexes > 0,
914 vacrel->NewRelfrozenXid, vacrel->NewRelminMxid,
915 &frozenxid_updated, &minmulti_updated, false);
916
917 /*
918 * Report results to the cumulative stats system, too.
919 *
920 * Deliberately avoid telling the stats system about LP_DEAD items that
921 * remain in the table due to VACUUM bypassing index and heap vacuuming.
922 * ANALYZE will consider the remaining LP_DEAD items to be dead "tuples".
923 * It seems like a good idea to err on the side of not vacuuming again too
924 * soon in cases where the failsafe prevented significant amounts of heap
925 * vacuuming.
926 */
928 rel->rd_rel->relisshared,
929 Max(vacrel->new_live_tuples, 0),
930 vacrel->recently_dead_tuples +
931 vacrel->missed_dead_tuples,
932 starttime);
934
935 if (instrument)
936 {
938
939 if (verbose || params->log_min_duration == 0 ||
940 TimestampDifferenceExceeds(starttime, endtime,
941 params->log_min_duration))
942 {
943 long secs_dur;
944 int usecs_dur;
945 WalUsage walusage;
946 BufferUsage bufferusage;
948 char *msgfmt;
949 int32 diff;
950 double read_rate = 0,
951 write_rate = 0;
952 int64 total_blks_hit;
953 int64 total_blks_read;
954 int64 total_blks_dirtied;
955
956 TimestampDifference(starttime, endtime, &secs_dur, &usecs_dur);
957 memset(&walusage, 0, sizeof(WalUsage));
958 WalUsageAccumDiff(&walusage, &pgWalUsage, &startwalusage);
959 memset(&bufferusage, 0, sizeof(BufferUsage));
960 BufferUsageAccumDiff(&bufferusage, &pgBufferUsage, &startbufferusage);
961
962 total_blks_hit = bufferusage.shared_blks_hit +
963 bufferusage.local_blks_hit;
964 total_blks_read = bufferusage.shared_blks_read +
965 bufferusage.local_blks_read;
966 total_blks_dirtied = bufferusage.shared_blks_dirtied +
967 bufferusage.local_blks_dirtied;
968
970 if (verbose)
971 {
972 /*
973 * Aggressiveness already reported earlier, in dedicated
974 * VACUUM VERBOSE ereport
975 */
976 Assert(!params->is_wraparound);
977 msgfmt = _("finished vacuuming \"%s.%s.%s\": index scans: %d\n");
978 }
979 else if (params->is_wraparound)
980 {
981 /*
982 * While it's possible for a VACUUM to be both is_wraparound
983 * and !aggressive, that's just a corner-case -- is_wraparound
984 * implies aggressive. Produce distinct output for the corner
985 * case all the same, just in case.
986 */
987 if (vacrel->aggressive)
988 msgfmt = _("automatic aggressive vacuum to prevent wraparound of table \"%s.%s.%s\": index scans: %d\n");
989 else
990 msgfmt = _("automatic vacuum to prevent wraparound of table \"%s.%s.%s\": index scans: %d\n");
991 }
992 else
993 {
994 if (vacrel->aggressive)
995 msgfmt = _("automatic aggressive vacuum of table \"%s.%s.%s\": index scans: %d\n");
996 else
997 msgfmt = _("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n");
998 }
999 appendStringInfo(&buf, msgfmt,
1000 vacrel->dbname,
1001 vacrel->relnamespace,
1002 vacrel->relname,
1003 vacrel->num_index_scans);
1004 appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eagerly scanned\n"),
1005 vacrel->removed_pages,
1006 new_rel_pages,
1007 vacrel->scanned_pages,
1008 orig_rel_pages == 0 ? 100.0 :
1009 100.0 * vacrel->scanned_pages /
1010 orig_rel_pages,
1011 vacrel->eager_scanned_pages);
1013 _("tuples: %lld removed, %lld remain, %lld are dead but not yet removable\n"),
1014 (long long) vacrel->tuples_deleted,
1015 (long long) vacrel->new_rel_tuples,
1016 (long long) vacrel->recently_dead_tuples);
1017 if (vacrel->missed_dead_tuples > 0)
1019 _("tuples missed: %lld dead from %u pages not removed due to cleanup lock contention\n"),
1020 (long long) vacrel->missed_dead_tuples,
1021 vacrel->missed_dead_pages);
1022 diff = (int32) (ReadNextTransactionId() -
1023 vacrel->cutoffs.OldestXmin);
1025 _("removable cutoff: %u, which was %d XIDs old when operation ended\n"),
1026 vacrel->cutoffs.OldestXmin, diff);
1027 if (frozenxid_updated)
1028 {
1029 diff = (int32) (vacrel->NewRelfrozenXid -
1030 vacrel->cutoffs.relfrozenxid);
1032 _("new relfrozenxid: %u, which is %d XIDs ahead of previous value\n"),
1033 vacrel->NewRelfrozenXid, diff);
1034 }
1035 if (minmulti_updated)
1036 {
1037 diff = (int32) (vacrel->NewRelminMxid -
1038 vacrel->cutoffs.relminmxid);
1040 _("new relminmxid: %u, which is %d MXIDs ahead of previous value\n"),
1041 vacrel->NewRelminMxid, diff);
1042 }
1043 appendStringInfo(&buf, _("frozen: %u pages from table (%.2f%% of total) had %lld tuples frozen\n"),
1044 vacrel->new_frozen_tuple_pages,
1045 orig_rel_pages == 0 ? 100.0 :
1046 100.0 * vacrel->new_frozen_tuple_pages /
1047 orig_rel_pages,
1048 (long long) vacrel->tuples_frozen);
1049
1051 _("visibility map: %u pages set all-visible, %u pages set all-frozen (%u were all-visible)\n"),
1052 vacrel->vm_new_visible_pages,
1054 vacrel->vm_new_frozen_pages,
1055 vacrel->vm_new_frozen_pages);
1056 if (vacrel->do_index_vacuuming)
1057 {
1058 if (vacrel->nindexes == 0 || vacrel->num_index_scans == 0)
1059 appendStringInfoString(&buf, _("index scan not needed: "));
1060 else
1061 appendStringInfoString(&buf, _("index scan needed: "));
1062
1063 msgfmt = _("%u pages from table (%.2f%% of total) had %lld dead item identifiers removed\n");
1064 }
1065 else
1066 {
1068 appendStringInfoString(&buf, _("index scan bypassed: "));
1069 else
1070 appendStringInfoString(&buf, _("index scan bypassed by failsafe: "));
1071
1072 msgfmt = _("%u pages from table (%.2f%% of total) have %lld dead item identifiers\n");
1073 }
1074 appendStringInfo(&buf, msgfmt,
1075 vacrel->lpdead_item_pages,
1076 orig_rel_pages == 0 ? 100.0 :
1077 100.0 * vacrel->lpdead_item_pages / orig_rel_pages,
1078 (long long) vacrel->lpdead_items);
1079 for (int i = 0; i < vacrel->nindexes; i++)
1080 {
1081 IndexBulkDeleteResult *istat = vacrel->indstats[i];
1082
1083 if (!istat)
1084 continue;
1085
1087 _("index \"%s\": pages: %u in total, %u newly deleted, %u currently deleted, %u reusable\n"),
1088 indnames[i],
1089 istat->num_pages,
1090 istat->pages_newly_deleted,
1091 istat->pages_deleted,
1092 istat->pages_free);
1093 }
1095 {
1096 /*
1097 * We bypass the changecount mechanism because this value is
1098 * only updated by the calling process. We also rely on the
1099 * above call to pgstat_progress_end_command() to not clear
1100 * the st_progress_param array.
1101 */
1102 appendStringInfo(&buf, _("delay time: %.3f ms\n"),
1104 }
1105 if (track_io_timing)
1106 {
1107 double read_ms = (double) (pgStatBlockReadTime - startreadtime) / 1000;
1108 double write_ms = (double) (pgStatBlockWriteTime - startwritetime) / 1000;
1109
1110 appendStringInfo(&buf, _("I/O timings: read: %.3f ms, write: %.3f ms\n"),
1111 read_ms, write_ms);
1112 }
1113 if (secs_dur > 0 || usecs_dur > 0)
1114 {
1115 read_rate = (double) BLCKSZ * total_blks_read /
1116 (1024 * 1024) / (secs_dur + usecs_dur / 1000000.0);
1117 write_rate = (double) BLCKSZ * total_blks_dirtied /
1118 (1024 * 1024) / (secs_dur + usecs_dur / 1000000.0);
1119 }
1120 appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"),
1121 read_rate, write_rate);
1123 _("buffer usage: %lld hits, %lld reads, %lld dirtied\n"),
1124 (long long) total_blks_hit,
1125 (long long) total_blks_read,
1126 (long long) total_blks_dirtied);
1128 _("WAL usage: %lld records, %lld full page images, %llu bytes\n"),
1129 (long long) walusage.wal_records,
1130 (long long) walusage.wal_fpi,
1131 (unsigned long long) walusage.wal_bytes);
1132 appendStringInfo(&buf, _("system usage: %s"), pg_rusage_show(&ru0));
1133
1134 ereport(verbose ? INFO : LOG,
1135 (errmsg_internal("%s", buf.data)));
1136 pfree(buf.data);
1137 }
1138 }
1139
1140 /* Cleanup index statistics and index names */
1141 for (int i = 0; i < vacrel->nindexes; i++)
1142 {
1143 if (vacrel->indstats[i])
1144 pfree(vacrel->indstats[i]);
1145
1146 if (instrument)
1147 pfree(indnames[i]);
1148 }
1149}
1150
1151/*
1152 * lazy_scan_heap() -- workhorse function for VACUUM
1153 *
1154 * This routine prunes each page in the heap, and considers the need to
1155 * freeze remaining tuples with storage (not including pages that can be
1156 * skipped using the visibility map). Also performs related maintenance
1157 * of the FSM and visibility map. These steps all take place during an
1158 * initial pass over the target heap relation.
1159 *
1160 * Also invokes lazy_vacuum_all_indexes to vacuum indexes, which largely
1161 * consists of deleting index tuples that point to LP_DEAD items left in
1162 * heap pages following pruning. Earlier initial pass over the heap will
1163 * have collected the TIDs whose index tuples need to be removed.
1164 *
1165 * Finally, invokes lazy_vacuum_heap_rel to vacuum heap pages, which
1166 * largely consists of marking LP_DEAD items (from vacrel->dead_items)
1167 * as LP_UNUSED. This has to happen in a second, final pass over the
1168 * heap, to preserve a basic invariant that all index AMs rely on: no
1169 * extant index tuple can ever be allowed to contain a TID that points to
1170 * an LP_UNUSED line pointer in the heap. We must disallow premature
1171 * recycling of line pointers to avoid index scans that get confused
1172 * about which TID points to which tuple immediately after recycling.
1173 * (Actually, this isn't a concern when target heap relation happens to
1174 * have no indexes, which allows us to safely apply the one-pass strategy
1175 * as an optimization).
1176 *
1177 * In practice we often have enough space to fit all TIDs, and so won't
1178 * need to call lazy_vacuum more than once, after our initial pass over
1179 * the heap has totally finished. Otherwise things are slightly more
1180 * complicated: our "initial pass" over the heap applies only to those
1181 * pages that were pruned before we needed to call lazy_vacuum, and our
1182 * "final pass" over the heap only vacuums these same heap pages.
1183 * However, we process indexes in full every time lazy_vacuum is called,
1184 * which makes index processing very inefficient when memory is in short
1185 * supply.
1186 */
1187static void
1189{
1190 ReadStream *stream;
1191 BlockNumber rel_pages = vacrel->rel_pages,
1192 blkno = 0,
1193 next_fsm_block_to_vacuum = 0;
1194 void *per_buffer_data = NULL;
1195 BlockNumber orig_eager_scan_success_limit =
1196 vacrel->eager_scan_remaining_successes; /* for logging */
1197 Buffer vmbuffer = InvalidBuffer;
1198 const int initprog_index[] = {
1202 };
1203 int64 initprog_val[3];
1204
1205 /* Report that we're scanning the heap, advertising total # of blocks */
1206 initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
1207 initprog_val[1] = rel_pages;
1208 initprog_val[2] = vacrel->dead_items_info->max_bytes;
1209 pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
1210
1211 /* Initialize for the first heap_vac_scan_next_block() call */
1214 vacrel->next_unskippable_allvis = false;
1215 vacrel->next_unskippable_eager_scanned = false;
1217
1218 /* Set up the read stream for vacuum's first pass through the heap */
1220 vacrel->bstrategy,
1221 vacrel->rel,
1224 vacrel,
1225 sizeof(uint8));
1226
1227 while (true)
1228 {
1229 Buffer buf;
1230 Page page;
1231 uint8 blk_info = 0;
1232 bool has_lpdead_items;
1233 bool vm_page_frozen = false;
1234 bool got_cleanup_lock = false;
1235
1236 vacuum_delay_point(false);
1237
1238 /*
1239 * Regularly check if wraparound failsafe should trigger.
1240 *
1241 * There is a similar check inside lazy_vacuum_all_indexes(), but
1242 * relfrozenxid might start to look dangerously old before we reach
1243 * that point. This check also provides failsafe coverage for the
1244 * one-pass strategy, and the two-pass strategy with the index_cleanup
1245 * param set to 'off'.
1246 */
1247 if (vacrel->scanned_pages > 0 &&
1248 vacrel->scanned_pages % FAILSAFE_EVERY_PAGES == 0)
1250
1251 /*
1252 * Consider if we definitely have enough space to process TIDs on page
1253 * already. If we are close to overrunning the available space for
1254 * dead_items TIDs, pause and do a cycle of vacuuming before we tackle
1255 * this page.
1256 */
1258 {
1259 /*
1260 * Before beginning index vacuuming, we release any pin we may
1261 * hold on the visibility map page. This isn't necessary for
1262 * correctness, but we do it anyway to avoid holding the pin
1263 * across a lengthy, unrelated operation.
1264 */
1265 if (BufferIsValid(vmbuffer))
1266 {
1267 ReleaseBuffer(vmbuffer);
1268 vmbuffer = InvalidBuffer;
1269 }
1270
1271 /* Perform a round of index and heap vacuuming */
1272 vacrel->consider_bypass_optimization = false;
1273 lazy_vacuum(vacrel);
1274
1275 /*
1276 * Vacuum the Free Space Map to make newly-freed space visible on
1277 * upper-level FSM pages. Note that blkno is the previously
1278 * processed block.
1279 */
1280 FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
1281 blkno + 1);
1282 next_fsm_block_to_vacuum = blkno;
1283
1284 /* Report that we are once again scanning the heap */
1287 }
1288
1289 buf = read_stream_next_buffer(stream, &per_buffer_data);
1290
1291 /* The relation is exhausted. */
1292 if (!BufferIsValid(buf))
1293 break;
1294
1295 blk_info = *((uint8 *) per_buffer_data);
1297 page = BufferGetPage(buf);
1298 blkno = BufferGetBlockNumber(buf);
1299
1300 vacrel->scanned_pages++;
1301 if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
1302 vacrel->eager_scanned_pages++;
1303
1304 /* Report as block scanned, update error traceback information */
1307 blkno, InvalidOffsetNumber);
1308
1309 /*
1310 * Pin the visibility map page in case we need to mark the page
1311 * all-visible. In most cases this will be very cheap, because we'll
1312 * already have the correct page pinned anyway.
1313 */
1314 visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
1315
1316 /*
1317 * We need a buffer cleanup lock to prune HOT chains and defragment
1318 * the page in lazy_scan_prune. But when it's not possible to acquire
1319 * a cleanup lock right away, we may be able to settle for reduced
1320 * processing using lazy_scan_noprune.
1321 */
1322 got_cleanup_lock = ConditionalLockBufferForCleanup(buf);
1323
1324 if (!got_cleanup_lock)
1326
1327 /* Check for new or empty pages before lazy_scan_[no]prune call */
1328 if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock,
1329 vmbuffer))
1330 {
1331 /* Processed as new/empty page (lock and pin released) */
1332 continue;
1333 }
1334
1335 /*
1336 * If we didn't get the cleanup lock, we can still collect LP_DEAD
1337 * items in the dead_items area for later vacuuming, count live and
1338 * recently dead tuples for vacuum logging, and determine if this
1339 * block could later be truncated. If we encounter any xid/mxids that
1340 * require advancing the relfrozenxid/relminxid, we'll have to wait
1341 * for a cleanup lock and call lazy_scan_prune().
1342 */
1343 if (!got_cleanup_lock &&
1344 !lazy_scan_noprune(vacrel, buf, blkno, page, &has_lpdead_items))
1345 {
1346 /*
1347 * lazy_scan_noprune could not do all required processing. Wait
1348 * for a cleanup lock, and call lazy_scan_prune in the usual way.
1349 */
1350 Assert(vacrel->aggressive);
1353 got_cleanup_lock = true;
1354 }
1355
1356 /*
1357 * If we have a cleanup lock, we must now prune, freeze, and count
1358 * tuples. We may have acquired the cleanup lock originally, or we may
1359 * have gone back and acquired it after lazy_scan_noprune() returned
1360 * false. Either way, the page hasn't been processed yet.
1361 *
1362 * Like lazy_scan_noprune(), lazy_scan_prune() will count
1363 * recently_dead_tuples and live tuples for vacuum logging, determine
1364 * if the block can later be truncated, and accumulate the details of
1365 * remaining LP_DEAD line pointers on the page into dead_items. These
1366 * dead items include those pruned by lazy_scan_prune() as well as
1367 * line pointers previously marked LP_DEAD.
1368 */
1369 if (got_cleanup_lock)
1370 lazy_scan_prune(vacrel, buf, blkno, page,
1371 vmbuffer,
1373 &has_lpdead_items, &vm_page_frozen);
1374
1375 /*
1376 * Count an eagerly scanned page as a failure or a success.
1377 *
1378 * Only lazy_scan_prune() freezes pages, so if we didn't get the
1379 * cleanup lock, we won't have frozen the page. However, we only count
1380 * pages that were too new to require freezing as eager freeze
1381 * failures.
1382 *
1383 * We could gather more information from lazy_scan_noprune() about
1384 * whether or not there were tuples with XIDs or MXIDs older than the
1385 * FreezeLimit or MultiXactCutoff. However, for simplicity, we simply
1386 * exclude pages skipped due to cleanup lock contention from eager
1387 * freeze algorithm caps.
1388 */
1389 if (got_cleanup_lock &&
1390 (blk_info & VAC_BLK_WAS_EAGER_SCANNED))
1391 {
1392 /* Aggressive vacuums do not eager scan. */
1393 Assert(!vacrel->aggressive);
1394
1395 if (vm_page_frozen)
1396 {
1399
1400 if (vacrel->eager_scan_remaining_successes == 0)
1401 {
1402 /*
1403 * If we hit our success cap, permanently disable eager
1404 * scanning by setting the other eager scan management
1405 * fields to their disabled values.
1406 */
1407 vacrel->eager_scan_remaining_fails = 0;
1410
1411 ereport(vacrel->verbose ? INFO : DEBUG2,
1412 (errmsg("disabling eager scanning after freezing %u eagerly scanned blocks of \"%s.%s.%s\"",
1413 orig_eager_scan_success_limit,
1414 vacrel->dbname, vacrel->relnamespace,
1415 vacrel->relname)));
1416 }
1417 }
1418 else
1419 {
1422 }
1423 }
1424
1425 /*
1426 * Now drop the buffer lock and, potentially, update the FSM.
1427 *
1428 * Our goal is to update the freespace map the last time we touch the
1429 * page. If we'll process a block in the second pass, we may free up
1430 * additional space on the page, so it is better to update the FSM
1431 * after the second pass. If the relation has no indexes, or if index
1432 * vacuuming is disabled, there will be no second heap pass; if this
1433 * particular page has no dead items, the second heap pass will not
1434 * touch this page. So, in those cases, update the FSM now.
1435 *
1436 * Note: In corner cases, it's possible to miss updating the FSM
1437 * entirely. If index vacuuming is currently enabled, we'll skip the
1438 * FSM update now. But if failsafe mode is later activated, or there
1439 * are so few dead tuples that index vacuuming is bypassed, there will
1440 * also be no opportunity to update the FSM later, because we'll never
1441 * revisit this page. Since updating the FSM is desirable but not
1442 * absolutely required, that's OK.
1443 */
1444 if (vacrel->nindexes == 0
1445 || !vacrel->do_index_vacuuming
1446 || !has_lpdead_items)
1447 {
1448 Size freespace = PageGetHeapFreeSpace(page);
1449
1451 RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
1452
1453 /*
1454 * Periodically perform FSM vacuuming to make newly-freed space
1455 * visible on upper FSM pages. This is done after vacuuming if the
1456 * table has indexes. There will only be newly-freed space if we
1457 * held the cleanup lock and lazy_scan_prune() was called.
1458 */
1459 if (got_cleanup_lock && vacrel->nindexes == 0 && has_lpdead_items &&
1460 blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
1461 {
1462 FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
1463 blkno);
1464 next_fsm_block_to_vacuum = blkno;
1465 }
1466 }
1467 else
1469 }
1470
1471 vacrel->blkno = InvalidBlockNumber;
1472 if (BufferIsValid(vmbuffer))
1473 ReleaseBuffer(vmbuffer);
1474
1475 /*
1476 * Report that everything is now scanned. We never skip scanning the last
1477 * block in the relation, so we can pass rel_pages here.
1478 */
1480 rel_pages);
1481
1482 /* now we can compute the new value for pg_class.reltuples */
1483 vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
1484 vacrel->scanned_pages,
1485 vacrel->live_tuples);
1486
1487 /*
1488 * Also compute the total number of surviving heap entries. In the
1489 * (unlikely) scenario that new_live_tuples is -1, take it as zero.
1490 */
1491 vacrel->new_rel_tuples =
1492 Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
1493 vacrel->missed_dead_tuples;
1494
1495 read_stream_end(stream);
1496
1497 /*
1498 * Do index vacuuming (call each index's ambulkdelete routine), then do
1499 * related heap vacuuming
1500 */
1501 if (vacrel->dead_items_info->num_items > 0)
1502 lazy_vacuum(vacrel);
1503
1504 /*
1505 * Vacuum the remainder of the Free Space Map. We must do this whether or
1506 * not there were indexes, and whether or not we bypassed index vacuuming.
1507 * We can pass rel_pages here because we never skip scanning the last
1508 * block of the relation.
1509 */
1510 if (rel_pages > next_fsm_block_to_vacuum)
1511 FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
1512
1513 /* report all blocks vacuumed */
1515
1516 /* Do final index cleanup (call each index's amvacuumcleanup routine) */
1517 if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
1519}
1520
1521/*
1522 * heap_vac_scan_next_block() -- read stream callback to get the next block
1523 * for vacuum to process
1524 *
1525 * Every time lazy_scan_heap() needs a new block to process during its first
1526 * phase, it invokes read_stream_next_buffer() with a stream set up to call
1527 * heap_vac_scan_next_block() to get the next block.
1528 *
1529 * heap_vac_scan_next_block() uses the visibility map, vacuum options, and
1530 * various thresholds to skip blocks which do not need to be processed and
1531 * returns the next block to process or InvalidBlockNumber if there are no
1532 * remaining blocks.
1533 *
1534 * The visibility status of the next block to process and whether or not it
1535 * was eager scanned is set in the per_buffer_data.
1536 *
1537 * callback_private_data contains a reference to the LVRelState, passed to the
1538 * read stream API during stream setup. The LVRelState is an in/out parameter
1539 * here (locally named `vacrel`). Vacuum options and information about the
1540 * relation are read from it. vacrel->skippedallvis is set if we skip a block
1541 * that's all-visible but not all-frozen (to ensure that we don't update
1542 * relfrozenxid in that case). vacrel also holds information about the next
1543 * unskippable block -- as bookkeeping for this function.
1544 */
1545static BlockNumber
1547 void *callback_private_data,
1548 void *per_buffer_data)
1549{
1550 BlockNumber next_block;
1551 LVRelState *vacrel = callback_private_data;
1552 uint8 blk_info = 0;
1553
1554 /* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
1555 next_block = vacrel->current_block + 1;
1556
1557 /* Have we reached the end of the relation? */
1558 if (next_block >= vacrel->rel_pages)
1559 {
1561 {
1564 }
1565 return InvalidBlockNumber;
1566 }
1567
1568 /*
1569 * We must be in one of the three following states:
1570 */
1571 if (next_block > vacrel->next_unskippable_block ||
1573 {
1574 /*
1575 * 1. We have just processed an unskippable block (or we're at the
1576 * beginning of the scan). Find the next unskippable block using the
1577 * visibility map.
1578 */
1579 bool skipsallvis;
1580
1581 find_next_unskippable_block(vacrel, &skipsallvis);
1582
1583 /*
1584 * We now know the next block that we must process. It can be the
1585 * next block after the one we just processed, or something further
1586 * ahead. If it's further ahead, we can jump to it, but we choose to
1587 * do so only if we can skip at least SKIP_PAGES_THRESHOLD consecutive
1588 * pages. Since we're reading sequentially, the OS should be doing
1589 * readahead for us, so there's no gain in skipping a page now and
1590 * then. Skipping such a range might even discourage sequential
1591 * detection.
1592 *
1593 * This test also enables more frequent relfrozenxid advancement
1594 * during non-aggressive VACUUMs. If the range has any all-visible
1595 * pages then skipping makes updating relfrozenxid unsafe, which is a
1596 * real downside.
1597 */
1598 if (vacrel->next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
1599 {
1600 next_block = vacrel->next_unskippable_block;
1601 if (skipsallvis)
1602 vacrel->skippedallvis = true;
1603 }
1604 }
1605
1606 /* Now we must be in one of the two remaining states: */
1607 if (next_block < vacrel->next_unskippable_block)
1608 {
1609 /*
1610 * 2. We are processing a range of blocks that we could have skipped
1611 * but chose not to. We know that they are all-visible in the VM,
1612 * otherwise they would've been unskippable.
1613 */
1614 vacrel->current_block = next_block;
1616 *((uint8 *) per_buffer_data) = blk_info;
1617 return vacrel->current_block;
1618 }
1619 else
1620 {
1621 /*
1622 * 3. We reached the next unskippable block. Process it. On next
1623 * iteration, we will be back in state 1.
1624 */
1625 Assert(next_block == vacrel->next_unskippable_block);
1626
1627 vacrel->current_block = next_block;
1628 if (vacrel->next_unskippable_allvis)
1631 blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
1632 *((uint8 *) per_buffer_data) = blk_info;
1633 return vacrel->current_block;
1634 }
1635}
1636
1637/*
1638 * Find the next unskippable block in a vacuum scan using the visibility map.
1639 * The next unskippable block and its visibility information is updated in
1640 * vacrel.
1641 *
1642 * Note: our opinion of which blocks can be skipped can go stale immediately.
1643 * It's okay if caller "misses" a page whose all-visible or all-frozen marking
1644 * was concurrently cleared, though. All that matters is that caller scan all
1645 * pages whose tuples might contain XIDs < OldestXmin, or MXIDs < OldestMxact.
1646 * (Actually, non-aggressive VACUUMs can choose to skip all-visible pages with
1647 * older XIDs/MXIDs. The *skippedallvis flag will be set here when the choice
1648 * to skip such a range is actually made, making everything safe.)
1649 */
1650static void
1651find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
1652{
1653 BlockNumber rel_pages = vacrel->rel_pages;
1654 BlockNumber next_unskippable_block = vacrel->next_unskippable_block + 1;
1655 Buffer next_unskippable_vmbuffer = vacrel->next_unskippable_vmbuffer;
1656 bool next_unskippable_eager_scanned = false;
1657 bool next_unskippable_allvis;
1658
1659 *skipsallvis = false;
1660
1661 for (;; next_unskippable_block++)
1662 {
1663 uint8 mapbits = visibilitymap_get_status(vacrel->rel,
1664 next_unskippable_block,
1665 &next_unskippable_vmbuffer);
1666
1667 next_unskippable_allvis = (mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0;
1668
1669 /*
1670 * At the start of each eager scan region, normal vacuums with eager
1671 * scanning enabled reset the failure counter, allowing vacuum to
1672 * resume eager scanning if it had been suspended in the previous
1673 * region.
1674 */
1675 if (next_unskippable_block >= vacrel->next_eager_scan_region_start)
1676 {
1680 }
1681
1682 /*
1683 * A block is unskippable if it is not all visible according to the
1684 * visibility map.
1685 */
1686 if (!next_unskippable_allvis)
1687 {
1688 Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
1689 break;
1690 }
1691
1692 /*
1693 * Caller must scan the last page to determine whether it has tuples
1694 * (caller must have the opportunity to set vacrel->nonempty_pages).
1695 * This rule avoids having lazy_truncate_heap() take access-exclusive
1696 * lock on rel to attempt a truncation that fails anyway, just because
1697 * there are tuples on the last page (it is likely that there will be
1698 * tuples on other nearby pages as well, but those can be skipped).
1699 *
1700 * Implement this by always treating the last block as unsafe to skip.
1701 */
1702 if (next_unskippable_block == rel_pages - 1)
1703 break;
1704
1705 /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
1706 if (!vacrel->skipwithvm)
1707 break;
1708
1709 /*
1710 * All-frozen pages cannot contain XIDs < OldestXmin (XIDs that aren't
1711 * already frozen by now), so this page can be skipped.
1712 */
1713 if ((mapbits & VISIBILITYMAP_ALL_FROZEN) != 0)
1714 continue;
1715
1716 /*
1717 * Aggressive vacuums cannot skip any all-visible pages that are not
1718 * also all-frozen.
1719 */
1720 if (vacrel->aggressive)
1721 break;
1722
1723 /*
1724 * Normal vacuums with eager scanning enabled only skip all-visible
1725 * but not all-frozen pages if they have hit the failure limit for the
1726 * current eager scan region.
1727 */
1728 if (vacrel->eager_scan_remaining_fails > 0)
1729 {
1730 next_unskippable_eager_scanned = true;
1731 break;
1732 }
1733
1734 /*
1735 * All-visible blocks are safe to skip in a normal vacuum. But
1736 * remember that the final range contains such a block for later.
1737 */
1738 *skipsallvis = true;
1739 }
1740
1741 /* write the local variables back to vacrel */
1742 vacrel->next_unskippable_block = next_unskippable_block;
1743 vacrel->next_unskippable_allvis = next_unskippable_allvis;
1744 vacrel->next_unskippable_eager_scanned = next_unskippable_eager_scanned;
1745 vacrel->next_unskippable_vmbuffer = next_unskippable_vmbuffer;
1746}
1747
1748/*
1749 * lazy_scan_new_or_empty() -- lazy_scan_heap() new/empty page handling.
1750 *
1751 * Must call here to handle both new and empty pages before calling
1752 * lazy_scan_prune or lazy_scan_noprune, since they're not prepared to deal
1753 * with new or empty pages.
1754 *
1755 * It's necessary to consider new pages as a special case, since the rules for
1756 * maintaining the visibility map and FSM with empty pages are a little
1757 * different (though new pages can be truncated away during rel truncation).
1758 *
1759 * Empty pages are not really a special case -- they're just heap pages that
1760 * have no allocated tuples (including even LP_UNUSED items). You might
1761 * wonder why we need to handle them here all the same. It's only necessary
1762 * because of a corner-case involving a hard crash during heap relation
1763 * extension. If we ever make relation-extension crash safe, then it should
1764 * no longer be necessary to deal with empty pages here (or new pages, for
1765 * that matter).
1766 *
1767 * Caller must hold at least a shared lock. We might need to escalate the
1768 * lock in that case, so the type of lock caller holds needs to be specified
1769 * using 'sharelock' argument.
1770 *
1771 * Returns false in common case where caller should go on to call
1772 * lazy_scan_prune (or lazy_scan_noprune). Otherwise returns true, indicating
1773 * that lazy_scan_heap is done processing the page, releasing lock on caller's
1774 * behalf.
1775 *
1776 * No vm_page_frozen output parameter (like that passed to lazy_scan_prune())
1777 * is passed here because neither empty nor new pages can be eagerly frozen.
1778 * New pages are never frozen. Empty pages are always set frozen in the VM at
1779 * the same time that they are set all-visible, and we don't eagerly scan
1780 * frozen pages.
1781 */
1782static bool
1784 Page page, bool sharelock, Buffer vmbuffer)
1785{
1786 Size freespace;
1787
1788 if (PageIsNew(page))
1789 {
1790 /*
1791 * All-zeroes pages can be left over if either a backend extends the
1792 * relation by a single page, but crashes before the newly initialized
1793 * page has been written out, or when bulk-extending the relation
1794 * (which creates a number of empty pages at the tail end of the
1795 * relation), and then enters them into the FSM.
1796 *
1797 * Note we do not enter the page into the visibilitymap. That has the
1798 * downside that we repeatedly visit this page in subsequent vacuums,
1799 * but otherwise we'll never discover the space on a promoted standby.
1800 * The harm of repeated checking ought to normally not be too bad. The
1801 * space usually should be used at some point, otherwise there
1802 * wouldn't be any regular vacuums.
1803 *
1804 * Make sure these pages are in the FSM, to ensure they can be reused.
1805 * Do that by testing if there's any space recorded for the page. If
1806 * not, enter it. We do so after releasing the lock on the heap page,
1807 * the FSM is approximate, after all.
1808 */
1810
1811 if (GetRecordedFreeSpace(vacrel->rel, blkno) == 0)
1812 {
1813 freespace = BLCKSZ - SizeOfPageHeaderData;
1814
1815 RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
1816 }
1817
1818 return true;
1819 }
1820
1821 if (PageIsEmpty(page))
1822 {
1823 /*
1824 * It seems likely that caller will always be able to get a cleanup
1825 * lock on an empty page. But don't take any chances -- escalate to
1826 * an exclusive lock (still don't need a cleanup lock, though).
1827 */
1828 if (sharelock)
1829 {
1832
1833 if (!PageIsEmpty(page))
1834 {
1835 /* page isn't new or empty -- keep lock and pin for now */
1836 return false;
1837 }
1838 }
1839 else
1840 {
1841 /* Already have a full cleanup lock (which is more than enough) */
1842 }
1843
1844 /*
1845 * Unlike new pages, empty pages are always set all-visible and
1846 * all-frozen.
1847 */
1848 if (!PageIsAllVisible(page))
1849 {
1850 uint8 old_vmbits;
1851
1853
1854 /* mark buffer dirty before writing a WAL record */
1856
1857 /*
1858 * It's possible that another backend has extended the heap,
1859 * initialized the page, and then failed to WAL-log the page due
1860 * to an ERROR. Since heap extension is not WAL-logged, recovery
1861 * might try to replay our record setting the page all-visible and
1862 * find that the page isn't initialized, which will cause a PANIC.
1863 * To prevent that, check whether the page has been previously
1864 * WAL-logged, and if not, do that now.
1865 */
1866 if (RelationNeedsWAL(vacrel->rel) &&
1868 log_newpage_buffer(buf, true);
1869
1870 PageSetAllVisible(page);
1871 old_vmbits = visibilitymap_set(vacrel->rel, blkno, buf,
1873 vmbuffer, InvalidTransactionId,
1877
1878 /*
1879 * If the page wasn't already set all-visible and/or all-frozen in
1880 * the VM, count it as newly set for logging.
1881 */
1882 if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
1883 {
1884 vacrel->vm_new_visible_pages++;
1886 }
1887 else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0)
1888 vacrel->vm_new_frozen_pages++;
1889 }
1890
1891 freespace = PageGetHeapFreeSpace(page);
1893 RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
1894 return true;
1895 }
1896
1897 /* page isn't new or empty -- keep lock and pin */
1898 return false;
1899}
1900
1901/* qsort comparator for sorting OffsetNumbers */
1902static int
1903cmpOffsetNumbers(const void *a, const void *b)
1904{
1905 return pg_cmp_u16(*(const OffsetNumber *) a, *(const OffsetNumber *) b);
1906}
1907
1908/*
1909 * lazy_scan_prune() -- lazy_scan_heap() pruning and freezing.
1910 *
1911 * Caller must hold pin and buffer cleanup lock on the buffer.
1912 *
1913 * vmbuffer is the buffer containing the VM block with visibility information
1914 * for the heap block, blkno. all_visible_according_to_vm is the saved
1915 * visibility status of the heap block looked up earlier by the caller. We
1916 * won't rely entirely on this status, as it may be out of date.
1917 *
1918 * *has_lpdead_items is set to true or false depending on whether, upon return
1919 * from this function, any LP_DEAD items are still present on the page.
1920 *
1921 * *vm_page_frozen is set to true if the page is newly set all-frozen in the
1922 * VM. The caller currently only uses this for determining whether an eagerly
1923 * scanned page was successfully set all-frozen.
1924 */
1925static void
1927 Buffer buf,
1928 BlockNumber blkno,
1929 Page page,
1930 Buffer vmbuffer,
1931 bool all_visible_according_to_vm,
1932 bool *has_lpdead_items,
1933 bool *vm_page_frozen)
1934{
1935 Relation rel = vacrel->rel;
1936 PruneFreezeResult presult;
1937 int prune_options = 0;
1938
1939 Assert(BufferGetBlockNumber(buf) == blkno);
1940
1941 /*
1942 * Prune all HOT-update chains and potentially freeze tuples on this page.
1943 *
1944 * If the relation has no indexes, we can immediately mark would-be dead
1945 * items LP_UNUSED.
1946 *
1947 * The number of tuples removed from the page is returned in
1948 * presult.ndeleted. It should not be confused with presult.lpdead_items;
1949 * presult.lpdead_items's final value can be thought of as the number of
1950 * tuples that were deleted from indexes.
1951 *
1952 * We will update the VM after collecting LP_DEAD items and freezing
1953 * tuples. Pruning will have determined whether or not the page is
1954 * all-visible.
1955 */
1956 prune_options = HEAP_PAGE_PRUNE_FREEZE;
1957 if (vacrel->nindexes == 0)
1958 prune_options |= HEAP_PAGE_PRUNE_MARK_UNUSED_NOW;
1959
1960 heap_page_prune_and_freeze(rel, buf, vacrel->vistest, prune_options,
1961 &vacrel->cutoffs, &presult, PRUNE_VACUUM_SCAN,
1962 &vacrel->offnum,
1963 &vacrel->NewRelfrozenXid, &vacrel->NewRelminMxid);
1964
1967
1968 if (presult.nfrozen > 0)
1969 {
1970 /*
1971 * We don't increment the new_frozen_tuple_pages instrumentation
1972 * counter when nfrozen == 0, since it only counts pages with newly
1973 * frozen tuples (don't confuse that with pages newly set all-frozen
1974 * in VM).
1975 */
1976 vacrel->new_frozen_tuple_pages++;
1977 }
1978
1979 /*
1980 * VACUUM will call heap_page_is_all_visible() during the second pass over
1981 * the heap to determine all_visible and all_frozen for the page -- this
1982 * is a specialized version of the logic from this function. Now that
1983 * we've finished pruning and freezing, make sure that we're in total
1984 * agreement with heap_page_is_all_visible() using an assertion.
1985 */
1986#ifdef USE_ASSERT_CHECKING
1987 /* Note that all_frozen value does not matter when !all_visible */
1988 if (presult.all_visible)
1989 {
1990 TransactionId debug_cutoff;
1991 bool debug_all_frozen;
1992
1993 Assert(presult.lpdead_items == 0);
1994
1995 if (!heap_page_is_all_visible(vacrel, buf,
1996 &debug_cutoff, &debug_all_frozen))
1997 Assert(false);
1998
1999 Assert(presult.all_frozen == debug_all_frozen);
2000
2001 Assert(!TransactionIdIsValid(debug_cutoff) ||
2002 debug_cutoff == presult.vm_conflict_horizon);
2003 }
2004#endif
2005
2006 /*
2007 * Now save details of the LP_DEAD items from the page in vacrel
2008 */
2009 if (presult.lpdead_items > 0)
2010 {
2011 vacrel->lpdead_item_pages++;
2012
2013 /*
2014 * deadoffsets are collected incrementally in
2015 * heap_page_prune_and_freeze() as each dead line pointer is recorded,
2016 * with an indeterminate order, but dead_items_add requires them to be
2017 * sorted.
2018 */
2019 qsort(presult.deadoffsets, presult.lpdead_items, sizeof(OffsetNumber),
2021
2022 dead_items_add(vacrel, blkno, presult.deadoffsets, presult.lpdead_items);
2023 }
2024
2025 /* Finally, add page-local counts to whole-VACUUM counts */
2026 vacrel->tuples_deleted += presult.ndeleted;
2027 vacrel->tuples_frozen += presult.nfrozen;
2028 vacrel->lpdead_items += presult.lpdead_items;
2029 vacrel->live_tuples += presult.live_tuples;
2030 vacrel->recently_dead_tuples += presult.recently_dead_tuples;
2031
2032 /* Can't truncate this page */
2033 if (presult.hastup)
2034 vacrel->nonempty_pages = blkno + 1;
2035
2036 /* Did we find LP_DEAD items? */
2037 *has_lpdead_items = (presult.lpdead_items > 0);
2038
2039 Assert(!presult.all_visible || !(*has_lpdead_items));
2040
2041 /*
2042 * Handle setting visibility map bit based on information from the VM (as
2043 * of last heap_vac_scan_next_block() call), and from all_visible and
2044 * all_frozen variables
2045 */
2046 if (!all_visible_according_to_vm && presult.all_visible)
2047 {
2048 uint8 old_vmbits;
2050
2051 if (presult.all_frozen)
2052 {
2054 flags |= VISIBILITYMAP_ALL_FROZEN;
2055 }
2056
2057 /*
2058 * It should never be the case that the visibility map page is set
2059 * while the page-level bit is clear, but the reverse is allowed (if
2060 * checksums are not enabled). Regardless, set both bits so that we
2061 * get back in sync.
2062 *
2063 * NB: If the heap page is all-visible but the VM bit is not set, we
2064 * don't need to dirty the heap page. However, if checksums are
2065 * enabled, we do need to make sure that the heap page is dirtied
2066 * before passing it to visibilitymap_set(), because it may be logged.
2067 * Given that this situation should only happen in rare cases after a
2068 * crash, it is not worth optimizing.
2069 */
2070 PageSetAllVisible(page);
2072 old_vmbits = visibilitymap_set(vacrel->rel, blkno, buf,
2074 vmbuffer, presult.vm_conflict_horizon,
2075 flags);
2076
2077 /*
2078 * If the page wasn't already set all-visible and/or all-frozen in the
2079 * VM, count it as newly set for logging.
2080 */
2081 if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
2082 {
2083 vacrel->vm_new_visible_pages++;
2084 if (presult.all_frozen)
2085 {
2087 *vm_page_frozen = true;
2088 }
2089 }
2090 else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 &&
2091 presult.all_frozen)
2092 {
2093 vacrel->vm_new_frozen_pages++;
2094 *vm_page_frozen = true;
2095 }
2096 }
2097
2098 /*
2099 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
2100 * page-level bit is clear. However, it's possible that the bit got
2101 * cleared after heap_vac_scan_next_block() was called, so we must recheck
2102 * with buffer lock before concluding that the VM is corrupt.
2103 */
2104 else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
2105 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
2106 {
2107 elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
2108 vacrel->relname, blkno);
2109 visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
2111 }
2112
2113 /*
2114 * It's possible for the value returned by
2115 * GetOldestNonRemovableTransactionId() to move backwards, so it's not
2116 * wrong for us to see tuples that appear to not be visible to everyone
2117 * yet, while PD_ALL_VISIBLE is already set. The real safe xmin value
2118 * never moves backwards, but GetOldestNonRemovableTransactionId() is
2119 * conservative and sometimes returns a value that's unnecessarily small,
2120 * so if we see that contradiction it just means that the tuples that we
2121 * think are not visible to everyone yet actually are, and the
2122 * PD_ALL_VISIBLE flag is correct.
2123 *
2124 * There should never be LP_DEAD items on a page with PD_ALL_VISIBLE set,
2125 * however.
2126 */
2127 else if (presult.lpdead_items > 0 && PageIsAllVisible(page))
2128 {
2129 elog(WARNING, "page containing LP_DEAD items is marked as all-visible in relation \"%s\" page %u",
2130 vacrel->relname, blkno);
2131 PageClearAllVisible(page);
2133 visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
2135 }
2136
2137 /*
2138 * If the all-visible page is all-frozen but not marked as such yet, mark
2139 * it as all-frozen. Note that all_frozen is only valid if all_visible is
2140 * true, so we must check both all_visible and all_frozen.
2141 */
2142 else if (all_visible_according_to_vm && presult.all_visible &&
2143 presult.all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
2144 {
2145 uint8 old_vmbits;
2146
2147 /*
2148 * Avoid relying on all_visible_according_to_vm as a proxy for the
2149 * page-level PD_ALL_VISIBLE bit being set, since it might have become
2150 * stale -- even when all_visible is set
2151 */
2152 if (!PageIsAllVisible(page))
2153 {
2154 PageSetAllVisible(page);
2156 }
2157
2158 /*
2159 * Set the page all-frozen (and all-visible) in the VM.
2160 *
2161 * We can pass InvalidTransactionId as our cutoff_xid, since a
2162 * snapshotConflictHorizon sufficient to make everything safe for REDO
2163 * was logged when the page's tuples were frozen.
2164 */
2166 old_vmbits = visibilitymap_set(vacrel->rel, blkno, buf,
2168 vmbuffer, InvalidTransactionId,
2171
2172 /*
2173 * The page was likely already set all-visible in the VM. However,
2174 * there is a small chance that it was modified sometime between
2175 * setting all_visible_according_to_vm and checking the visibility
2176 * during pruning. Check the return value of old_vmbits anyway to
2177 * ensure the visibility map counters used for logging are accurate.
2178 */
2179 if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
2180 {
2181 vacrel->vm_new_visible_pages++;
2183 *vm_page_frozen = true;
2184 }
2185
2186 /*
2187 * We already checked that the page was not set all-frozen in the VM
2188 * above, so we don't need to test the value of old_vmbits.
2189 */
2190 else
2191 {
2192 vacrel->vm_new_frozen_pages++;
2193 *vm_page_frozen = true;
2194 }
2195 }
2196}
2197
2198/*
2199 * lazy_scan_noprune() -- lazy_scan_prune() without pruning or freezing
2200 *
2201 * Caller need only hold a pin and share lock on the buffer, unlike
2202 * lazy_scan_prune, which requires a full cleanup lock. While pruning isn't
2203 * performed here, it's quite possible that an earlier opportunistic pruning
2204 * operation left LP_DEAD items behind. We'll at least collect any such items
2205 * in dead_items for removal from indexes.
2206 *
2207 * For aggressive VACUUM callers, we may return false to indicate that a full
2208 * cleanup lock is required for processing by lazy_scan_prune. This is only
2209 * necessary when the aggressive VACUUM needs to freeze some tuple XIDs from
2210 * one or more tuples on the page. We always return true for non-aggressive
2211 * callers.
2212 *
2213 * If this function returns true, *has_lpdead_items gets set to true or false
2214 * depending on whether, upon return from this function, any LP_DEAD items are
2215 * present on the page. If this function returns false, *has_lpdead_items
2216 * is not updated.
2217 */
2218static bool
2220 Buffer buf,
2221 BlockNumber blkno,
2222 Page page,
2223 bool *has_lpdead_items)
2224{
2225 OffsetNumber offnum,
2226 maxoff;
2227 int lpdead_items,
2228 live_tuples,
2229 recently_dead_tuples,
2230 missed_dead_tuples;
2231 bool hastup;
2232 HeapTupleHeader tupleheader;
2233 TransactionId NoFreezePageRelfrozenXid = vacrel->NewRelfrozenXid;
2234 MultiXactId NoFreezePageRelminMxid = vacrel->NewRelminMxid;
2236
2237 Assert(BufferGetBlockNumber(buf) == blkno);
2238
2239 hastup = false; /* for now */
2240
2241 lpdead_items = 0;
2242 live_tuples = 0;
2243 recently_dead_tuples = 0;
2244 missed_dead_tuples = 0;
2245
2246 maxoff = PageGetMaxOffsetNumber(page);
2247 for (offnum = FirstOffsetNumber;
2248 offnum <= maxoff;
2249 offnum = OffsetNumberNext(offnum))
2250 {
2251 ItemId itemid;
2252 HeapTupleData tuple;
2253
2254 vacrel->offnum = offnum;
2255 itemid = PageGetItemId(page, offnum);
2256
2257 if (!ItemIdIsUsed(itemid))
2258 continue;
2259
2260 if (ItemIdIsRedirected(itemid))
2261 {
2262 hastup = true;
2263 continue;
2264 }
2265
2266 if (ItemIdIsDead(itemid))
2267 {
2268 /*
2269 * Deliberately don't set hastup=true here. See same point in
2270 * lazy_scan_prune for an explanation.
2271 */
2272 deadoffsets[lpdead_items++] = offnum;
2273 continue;
2274 }
2275
2276 hastup = true; /* page prevents rel truncation */
2277 tupleheader = (HeapTupleHeader) PageGetItem(page, itemid);
2278 if (heap_tuple_should_freeze(tupleheader, &vacrel->cutoffs,
2279 &NoFreezePageRelfrozenXid,
2280 &NoFreezePageRelminMxid))
2281 {
2282 /* Tuple with XID < FreezeLimit (or MXID < MultiXactCutoff) */
2283 if (vacrel->aggressive)
2284 {
2285 /*
2286 * Aggressive VACUUMs must always be able to advance rel's
2287 * relfrozenxid to a value >= FreezeLimit (and be able to
2288 * advance rel's relminmxid to a value >= MultiXactCutoff).
2289 * The ongoing aggressive VACUUM won't be able to do that
2290 * unless it can freeze an XID (or MXID) from this tuple now.
2291 *
2292 * The only safe option is to have caller perform processing
2293 * of this page using lazy_scan_prune. Caller might have to
2294 * wait a while for a cleanup lock, but it can't be helped.
2295 */
2296 vacrel->offnum = InvalidOffsetNumber;
2297 return false;
2298 }
2299
2300 /*
2301 * Non-aggressive VACUUMs are under no obligation to advance
2302 * relfrozenxid (even by one XID). We can be much laxer here.
2303 *
2304 * Currently we always just accept an older final relfrozenxid
2305 * and/or relminmxid value. We never make caller wait or work a
2306 * little harder, even when it likely makes sense to do so.
2307 */
2308 }
2309
2310 ItemPointerSet(&(tuple.t_self), blkno, offnum);
2311 tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
2312 tuple.t_len = ItemIdGetLength(itemid);
2313 tuple.t_tableOid = RelationGetRelid(vacrel->rel);
2314
2315 switch (HeapTupleSatisfiesVacuum(&tuple, vacrel->cutoffs.OldestXmin,
2316 buf))
2317 {
2319 case HEAPTUPLE_LIVE:
2320
2321 /*
2322 * Count both cases as live, just like lazy_scan_prune
2323 */
2324 live_tuples++;
2325
2326 break;
2327 case HEAPTUPLE_DEAD:
2328
2329 /*
2330 * There is some useful work for pruning to do, that won't be
2331 * done due to failure to get a cleanup lock.
2332 */
2333 missed_dead_tuples++;
2334 break;
2336
2337 /*
2338 * Count in recently_dead_tuples, just like lazy_scan_prune
2339 */
2340 recently_dead_tuples++;
2341 break;
2343
2344 /*
2345 * Do not count these rows as live, just like lazy_scan_prune
2346 */
2347 break;
2348 default:
2349 elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result");
2350 break;
2351 }
2352 }
2353
2354 vacrel->offnum = InvalidOffsetNumber;
2355
2356 /*
2357 * By here we know for sure that caller can put off freezing and pruning
2358 * this particular page until the next VACUUM. Remember its details now.
2359 * (lazy_scan_prune expects a clean slate, so we have to do this last.)
2360 */
2361 vacrel->NewRelfrozenXid = NoFreezePageRelfrozenXid;
2362 vacrel->NewRelminMxid = NoFreezePageRelminMxid;
2363
2364 /* Save any LP_DEAD items found on the page in dead_items */
2365 if (vacrel->nindexes == 0)
2366 {
2367 /* Using one-pass strategy (since table has no indexes) */
2368 if (lpdead_items > 0)
2369 {
2370 /*
2371 * Perfunctory handling for the corner case where a single pass
2372 * strategy VACUUM cannot get a cleanup lock, and it turns out
2373 * that there is one or more LP_DEAD items: just count the LP_DEAD
2374 * items as missed_dead_tuples instead. (This is a bit dishonest,
2375 * but it beats having to maintain specialized heap vacuuming code
2376 * forever, for vanishingly little benefit.)
2377 */
2378 hastup = true;
2379 missed_dead_tuples += lpdead_items;
2380 }
2381 }
2382 else if (lpdead_items > 0)
2383 {
2384 /*
2385 * Page has LP_DEAD items, and so any references/TIDs that remain in
2386 * indexes will be deleted during index vacuuming (and then marked
2387 * LP_UNUSED in the heap)
2388 */
2389 vacrel->lpdead_item_pages++;
2390
2391 dead_items_add(vacrel, blkno, deadoffsets, lpdead_items);
2392
2393 vacrel->lpdead_items += lpdead_items;
2394 }
2395
2396 /*
2397 * Finally, add relevant page-local counts to whole-VACUUM counts
2398 */
2399 vacrel->live_tuples += live_tuples;
2400 vacrel->recently_dead_tuples += recently_dead_tuples;
2401 vacrel->missed_dead_tuples += missed_dead_tuples;
2402 if (missed_dead_tuples > 0)
2403 vacrel->missed_dead_pages++;
2404
2405 /* Can't truncate this page */
2406 if (hastup)
2407 vacrel->nonempty_pages = blkno + 1;
2408
2409 /* Did we find LP_DEAD items? */
2410 *has_lpdead_items = (lpdead_items > 0);
2411
2412 /* Caller won't need to call lazy_scan_prune with same page */
2413 return true;
2414}
2415
2416/*
2417 * Main entry point for index vacuuming and heap vacuuming.
2418 *
2419 * Removes items collected in dead_items from table's indexes, then marks the
2420 * same items LP_UNUSED in the heap. See the comments above lazy_scan_heap
2421 * for full details.
2422 *
2423 * Also empties dead_items, freeing up space for later TIDs.
2424 *
2425 * We may choose to bypass index vacuuming at this point, though only when the
2426 * ongoing VACUUM operation will definitely only have one index scan/round of
2427 * index vacuuming.
2428 */
2429static void
2431{
2432 bool bypass;
2433
2434 /* Should not end up here with no indexes */
2435 Assert(vacrel->nindexes > 0);
2436 Assert(vacrel->lpdead_item_pages > 0);
2437
2438 if (!vacrel->do_index_vacuuming)
2439 {
2440 Assert(!vacrel->do_index_cleanup);
2441 dead_items_reset(vacrel);
2442 return;
2443 }
2444
2445 /*
2446 * Consider bypassing index vacuuming (and heap vacuuming) entirely.
2447 *
2448 * We currently only do this in cases where the number of LP_DEAD items
2449 * for the entire VACUUM operation is close to zero. This avoids sharp
2450 * discontinuities in the duration and overhead of successive VACUUM
2451 * operations that run against the same table with a fixed workload.
2452 * Ideally, successive VACUUM operations will behave as if there are
2453 * exactly zero LP_DEAD items in cases where there are close to zero.
2454 *
2455 * This is likely to be helpful with a table that is continually affected
2456 * by UPDATEs that can mostly apply the HOT optimization, but occasionally
2457 * have small aberrations that lead to just a few heap pages retaining
2458 * only one or two LP_DEAD items. This is pretty common; even when the
2459 * DBA goes out of their way to make UPDATEs use HOT, it is practically
2460 * impossible to predict whether HOT will be applied in 100% of cases.
2461 * It's far easier to ensure that 99%+ of all UPDATEs against a table use
2462 * HOT through careful tuning.
2463 */
2464 bypass = false;
2465 if (vacrel->consider_bypass_optimization && vacrel->rel_pages > 0)
2466 {
2467 BlockNumber threshold;
2468
2469 Assert(vacrel->num_index_scans == 0);
2470 Assert(vacrel->lpdead_items == vacrel->dead_items_info->num_items);
2471 Assert(vacrel->do_index_vacuuming);
2472 Assert(vacrel->do_index_cleanup);
2473
2474 /*
2475 * This crossover point at which we'll start to do index vacuuming is
2476 * expressed as a percentage of the total number of heap pages in the
2477 * table that are known to have at least one LP_DEAD item. This is
2478 * much more important than the total number of LP_DEAD items, since
2479 * it's a proxy for the number of heap pages whose visibility map bits
2480 * cannot be set on account of bypassing index and heap vacuuming.
2481 *
2482 * We apply one further precautionary test: the space currently used
2483 * to store the TIDs (TIDs that now all point to LP_DEAD items) must
2484 * not exceed 32MB. This limits the risk that we will bypass index
2485 * vacuuming again and again until eventually there is a VACUUM whose
2486 * dead_items space is not CPU cache resident.
2487 *
2488 * We don't take any special steps to remember the LP_DEAD items (such
2489 * as counting them in our final update to the stats system) when the
2490 * optimization is applied. Though the accounting used in analyze.c's
2491 * acquire_sample_rows() will recognize the same LP_DEAD items as dead
2492 * rows in its own stats report, that's okay. The discrepancy should
2493 * be negligible. If this optimization is ever expanded to cover more
2494 * cases then this may need to be reconsidered.
2495 */
2496 threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
2497 bypass = (vacrel->lpdead_item_pages < threshold &&
2498 TidStoreMemoryUsage(vacrel->dead_items) < 32 * 1024 * 1024);
2499 }
2500
2501 if (bypass)
2502 {
2503 /*
2504 * There are almost zero TIDs. Behave as if there were precisely
2505 * zero: bypass index vacuuming, but do index cleanup.
2506 *
2507 * We expect that the ongoing VACUUM operation will finish very
2508 * quickly, so there is no point in considering speeding up as a
2509 * failsafe against wraparound failure. (Index cleanup is expected to
2510 * finish very quickly in cases where there were no ambulkdelete()
2511 * calls.)
2512 */
2513 vacrel->do_index_vacuuming = false;
2514 }
2515 else if (lazy_vacuum_all_indexes(vacrel))
2516 {
2517 /*
2518 * We successfully completed a round of index vacuuming. Do related
2519 * heap vacuuming now.
2520 */
2521 lazy_vacuum_heap_rel(vacrel);
2522 }
2523 else
2524 {
2525 /*
2526 * Failsafe case.
2527 *
2528 * We attempted index vacuuming, but didn't finish a full round/full
2529 * index scan. This happens when relfrozenxid or relminmxid is too
2530 * far in the past.
2531 *
2532 * From this point on the VACUUM operation will do no further index
2533 * vacuuming or heap vacuuming. This VACUUM operation won't end up
2534 * back here again.
2535 */
2537 }
2538
2539 /*
2540 * Forget the LP_DEAD items that we just vacuumed (or just decided to not
2541 * vacuum)
2542 */
2543 dead_items_reset(vacrel);
2544}
2545
2546/*
2547 * lazy_vacuum_all_indexes() -- Main entry for index vacuuming
2548 *
2549 * Returns true in the common case when all indexes were successfully
2550 * vacuumed. Returns false in rare cases where we determined that the ongoing
2551 * VACUUM operation is at risk of taking too long to finish, leading to
2552 * wraparound failure.
2553 */
2554static bool
2556{
2557 bool allindexes = true;
2558 double old_live_tuples = vacrel->rel->rd_rel->reltuples;
2559 const int progress_start_index[] = {
2562 };
2563 const int progress_end_index[] = {
2567 };
2568 int64 progress_start_val[2];
2569 int64 progress_end_val[3];
2570
2571 Assert(vacrel->nindexes > 0);
2572 Assert(vacrel->do_index_vacuuming);
2573 Assert(vacrel->do_index_cleanup);
2574
2575 /* Precheck for XID wraparound emergencies */
2577 {
2578 /* Wraparound emergency -- don't even start an index scan */
2579 return false;
2580 }
2581
2582 /*
2583 * Report that we are now vacuuming indexes and the number of indexes to
2584 * vacuum.
2585 */
2586 progress_start_val[0] = PROGRESS_VACUUM_PHASE_VACUUM_INDEX;
2587 progress_start_val[1] = vacrel->nindexes;
2588 pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val);
2589
2590 if (!ParallelVacuumIsActive(vacrel))
2591 {
2592 for (int idx = 0; idx < vacrel->nindexes; idx++)
2593 {
2594 Relation indrel = vacrel->indrels[idx];
2595 IndexBulkDeleteResult *istat = vacrel->indstats[idx];
2596
2597 vacrel->indstats[idx] = lazy_vacuum_one_index(indrel, istat,
2598 old_live_tuples,
2599 vacrel);
2600
2601 /* Report the number of indexes vacuumed */
2603 idx + 1);
2604
2606 {
2607 /* Wraparound emergency -- end current index scan */
2608 allindexes = false;
2609 break;
2610 }
2611 }
2612 }
2613 else
2614 {
2615 /* Outsource everything to parallel variant */
2616 parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
2617 vacrel->num_index_scans);
2618
2619 /*
2620 * Do a postcheck to consider applying wraparound failsafe now. Note
2621 * that parallel VACUUM only gets the precheck and this postcheck.
2622 */
2624 allindexes = false;
2625 }
2626
2627 /*
2628 * We delete all LP_DEAD items from the first heap pass in all indexes on
2629 * each call here (except calls where we choose to do the failsafe). This
2630 * makes the next call to lazy_vacuum_heap_rel() safe (except in the event
2631 * of the failsafe triggering, which prevents the next call from taking
2632 * place).
2633 */
2634 Assert(vacrel->num_index_scans > 0 ||
2635 vacrel->dead_items_info->num_items == vacrel->lpdead_items);
2636 Assert(allindexes || VacuumFailsafeActive);
2637
2638 /*
2639 * Increase and report the number of index scans. Also, we reset
2640 * PROGRESS_VACUUM_INDEXES_TOTAL and PROGRESS_VACUUM_INDEXES_PROCESSED.
2641 *
2642 * We deliberately include the case where we started a round of bulk
2643 * deletes that we weren't able to finish due to the failsafe triggering.
2644 */
2645 vacrel->num_index_scans++;
2646 progress_end_val[0] = 0;
2647 progress_end_val[1] = 0;
2648 progress_end_val[2] = vacrel->num_index_scans;
2649 pgstat_progress_update_multi_param(3, progress_end_index, progress_end_val);
2650
2651 return allindexes;
2652}
2653
2654/*
2655 * Read stream callback for vacuum's third phase (second pass over the heap).
2656 * Gets the next block from the TID store and returns it or InvalidBlockNumber
2657 * if there are no further blocks to vacuum.
2658 */
2659static BlockNumber
2661 void *callback_private_data,
2662 void *per_buffer_data)
2663{
2664 TidStoreIter *iter = callback_private_data;
2665 TidStoreIterResult *iter_result;
2666
2667 iter_result = TidStoreIterateNext(iter);
2668 if (iter_result == NULL)
2669 return InvalidBlockNumber;
2670
2671 /*
2672 * Save the TidStoreIterResult for later, so we can extract the offsets.
2673 * It is safe to copy the result, according to TidStoreIterateNext().
2674 */
2675 memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
2676
2677 return iter_result->blkno;
2678}
2679
2680/*
2681 * lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
2682 *
2683 * This routine marks LP_DEAD items in vacrel->dead_items as LP_UNUSED. Pages
2684 * that never had lazy_scan_prune record LP_DEAD items are not visited at all.
2685 *
2686 * We may also be able to truncate the line pointer array of the heap pages we
2687 * visit. If there is a contiguous group of LP_UNUSED items at the end of the
2688 * array, it can be reclaimed as free space. These LP_UNUSED items usually
2689 * start out as LP_DEAD items recorded by lazy_scan_prune (we set items from
2690 * each page to LP_UNUSED, and then consider if it's possible to truncate the
2691 * page's line pointer array).
2692 *
2693 * Note: the reason for doing this as a second pass is we cannot remove the
2694 * tuples until we've removed their index entries, and we want to process
2695 * index entry removal in batches as large as possible.
2696 */
2697static void
2699{
2700 ReadStream *stream;
2701 BlockNumber vacuumed_pages = 0;
2702 Buffer vmbuffer = InvalidBuffer;
2703 LVSavedErrInfo saved_err_info;
2704 TidStoreIter *iter;
2705
2706 Assert(vacrel->do_index_vacuuming);
2707 Assert(vacrel->do_index_cleanup);
2708 Assert(vacrel->num_index_scans > 0);
2709
2710 /* Report that we are now vacuuming the heap */
2713
2714 /* Update error traceback information */
2715 update_vacuum_error_info(vacrel, &saved_err_info,
2718
2719 iter = TidStoreBeginIterate(vacrel->dead_items);
2720
2721 /* Set up the read stream for vacuum's second pass through the heap */
2723 vacrel->bstrategy,
2724 vacrel->rel,
2727 iter,
2728 sizeof(TidStoreIterResult));
2729
2730 while (true)
2731 {
2732 BlockNumber blkno;
2733 Buffer buf;
2734 Page page;
2735 TidStoreIterResult *iter_result;
2736 Size freespace;
2738 int num_offsets;
2739
2740 vacuum_delay_point(false);
2741
2742 buf = read_stream_next_buffer(stream, (void **) &iter_result);
2743
2744 /* The relation is exhausted */
2745 if (!BufferIsValid(buf))
2746 break;
2747
2748 vacrel->blkno = blkno = BufferGetBlockNumber(buf);
2749
2750 Assert(iter_result);
2751 num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
2752 Assert(num_offsets <= lengthof(offsets));
2753
2754 /*
2755 * Pin the visibility map page in case we need to mark the page
2756 * all-visible. In most cases this will be very cheap, because we'll
2757 * already have the correct page pinned anyway.
2758 */
2759 visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
2760
2761 /* We need a non-cleanup exclusive lock to mark dead_items unused */
2763 lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
2764 num_offsets, vmbuffer);
2765
2766 /* Now that we've vacuumed the page, record its available space */
2767 page = BufferGetPage(buf);
2768 freespace = PageGetHeapFreeSpace(page);
2769
2771 RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
2772 vacuumed_pages++;
2773 }
2774
2775 read_stream_end(stream);
2776 TidStoreEndIterate(iter);
2777
2778 vacrel->blkno = InvalidBlockNumber;
2779 if (BufferIsValid(vmbuffer))
2780 ReleaseBuffer(vmbuffer);
2781
2782 /*
2783 * We set all LP_DEAD items from the first heap pass to LP_UNUSED during
2784 * the second heap pass. No more, no less.
2785 */
2786 Assert(vacrel->num_index_scans > 1 ||
2787 (vacrel->dead_items_info->num_items == vacrel->lpdead_items &&
2788 vacuumed_pages == vacrel->lpdead_item_pages));
2789
2791 (errmsg("table \"%s\": removed %lld dead item identifiers in %u pages",
2792 vacrel->relname, (long long) vacrel->dead_items_info->num_items,
2793 vacuumed_pages)));
2794
2795 /* Revert to the previous phase information for error traceback */
2796 restore_vacuum_error_info(vacrel, &saved_err_info);
2797}
2798
2799/*
2800 * lazy_vacuum_heap_page() -- free page's LP_DEAD items listed in the
2801 * vacrel->dead_items store.
2802 *
2803 * Caller must have an exclusive buffer lock on the buffer (though a full
2804 * cleanup lock is also acceptable). vmbuffer must be valid and already have
2805 * a pin on blkno's visibility map page.
2806 */
2807static void
2809 OffsetNumber *deadoffsets, int num_offsets,
2810 Buffer vmbuffer)
2811{
2812 Page page = BufferGetPage(buffer);
2814 int nunused = 0;
2815 TransactionId visibility_cutoff_xid;
2816 bool all_frozen;
2817 LVSavedErrInfo saved_err_info;
2818
2819 Assert(vacrel->do_index_vacuuming);
2820
2822
2823 /* Update error traceback information */
2824 update_vacuum_error_info(vacrel, &saved_err_info,
2827
2829
2830 for (int i = 0; i < num_offsets; i++)
2831 {
2832 ItemId itemid;
2833 OffsetNumber toff = deadoffsets[i];
2834
2835 itemid = PageGetItemId(page, toff);
2836
2837 Assert(ItemIdIsDead(itemid) && !ItemIdHasStorage(itemid));
2838 ItemIdSetUnused(itemid);
2839 unused[nunused++] = toff;
2840 }
2841
2842 Assert(nunused > 0);
2843
2844 /* Attempt to truncate line pointer array now */
2846
2847 /*
2848 * Mark buffer dirty before we write WAL.
2849 */
2850 MarkBufferDirty(buffer);
2851
2852 /* XLOG stuff */
2853 if (RelationNeedsWAL(vacrel->rel))
2854 {
2855 log_heap_prune_and_freeze(vacrel->rel, buffer,
2857 false, /* no cleanup lock required */
2859 NULL, 0, /* frozen */
2860 NULL, 0, /* redirected */
2861 NULL, 0, /* dead */
2862 unused, nunused);
2863 }
2864
2865 /*
2866 * End critical section, so we safely can do visibility tests (which
2867 * possibly need to perform IO and allocate memory!). If we crash now the
2868 * page (including the corresponding vm bit) might not be marked all
2869 * visible, but that's fine. A later vacuum will fix that.
2870 */
2872
2873 /*
2874 * Now that we have removed the LP_DEAD items from the page, once again
2875 * check if the page has become all-visible. The page is already marked
2876 * dirty, exclusively locked, and, if needed, a full page image has been
2877 * emitted.
2878 */
2879 Assert(!PageIsAllVisible(page));
2880 if (heap_page_is_all_visible(vacrel, buffer, &visibility_cutoff_xid,
2881 &all_frozen))
2882 {
2883 uint8 old_vmbits;
2885
2886 if (all_frozen)
2887 {
2888 Assert(!TransactionIdIsValid(visibility_cutoff_xid));
2889 flags |= VISIBILITYMAP_ALL_FROZEN;
2890 }
2891
2892 PageSetAllVisible(page);
2893 old_vmbits = visibilitymap_set(vacrel->rel, blkno, buffer,
2895 vmbuffer, visibility_cutoff_xid,
2896 flags);
2897
2898 /*
2899 * If the page wasn't already set all-visible and/or all-frozen in the
2900 * VM, count it as newly set for logging.
2901 */
2902 if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
2903 {
2904 vacrel->vm_new_visible_pages++;
2905 if (all_frozen)
2907 }
2908
2909 else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 &&
2910 all_frozen)
2911 vacrel->vm_new_frozen_pages++;
2912 }
2913
2914 /* Revert to the previous phase information for error traceback */
2915 restore_vacuum_error_info(vacrel, &saved_err_info);
2916}
2917
2918/*
2919 * Trigger the failsafe to avoid wraparound failure when vacrel table has a
2920 * relfrozenxid and/or relminmxid that is dangerously far in the past.
2921 * Triggering the failsafe makes the ongoing VACUUM bypass any further index
2922 * vacuuming and heap vacuuming. Truncating the heap is also bypassed.
2923 *
2924 * Any remaining work (work that VACUUM cannot just bypass) is typically sped
2925 * up when the failsafe triggers. VACUUM stops applying any cost-based delay
2926 * that it started out with.
2927 *
2928 * Returns true when failsafe has been triggered.
2929 */
2930static bool
2932{
2933 /* Don't warn more than once per VACUUM */
2935 return true;
2936
2938 {
2939 const int progress_index[] = {
2942 };
2943 int64 progress_val[2] = {0, 0};
2944
2945 VacuumFailsafeActive = true;
2946
2947 /*
2948 * Abandon use of a buffer access strategy to allow use of all of
2949 * shared buffers. We assume the caller who allocated the memory for
2950 * the BufferAccessStrategy will free it.
2951 */
2952 vacrel->bstrategy = NULL;
2953
2954 /* Disable index vacuuming, index cleanup, and heap rel truncation */
2955 vacrel->do_index_vacuuming = false;
2956 vacrel->do_index_cleanup = false;
2957 vacrel->do_rel_truncate = false;
2958
2959 /* Reset the progress counters */
2960 pgstat_progress_update_multi_param(2, progress_index, progress_val);
2961
2963 (errmsg("bypassing nonessential maintenance of table \"%s.%s.%s\" as a failsafe after %d index scans",
2964 vacrel->dbname, vacrel->relnamespace, vacrel->relname,
2965 vacrel->num_index_scans),
2966 errdetail("The table's relfrozenxid or relminmxid is too far in the past."),
2967 errhint("Consider increasing configuration parameter \"maintenance_work_mem\" or \"autovacuum_work_mem\".\n"
2968 "You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.")));
2969
2970 /* Stop applying cost limits from this point on */
2971 VacuumCostActive = false;
2973
2974 return true;
2975 }
2976
2977 return false;
2978}
2979
2980/*
2981 * lazy_cleanup_all_indexes() -- cleanup all indexes of relation.
2982 */
2983static void
2985{
2986 double reltuples = vacrel->new_rel_tuples;
2987 bool estimated_count = vacrel->scanned_pages < vacrel->rel_pages;
2988 const int progress_start_index[] = {
2991 };
2992 const int progress_end_index[] = {
2995 };
2996 int64 progress_start_val[2];
2997 int64 progress_end_val[2] = {0, 0};
2998
2999 Assert(vacrel->do_index_cleanup);
3000 Assert(vacrel->nindexes > 0);
3001
3002 /*
3003 * Report that we are now cleaning up indexes and the number of indexes to
3004 * cleanup.
3005 */
3006 progress_start_val[0] = PROGRESS_VACUUM_PHASE_INDEX_CLEANUP;
3007 progress_start_val[1] = vacrel->nindexes;
3008 pgstat_progress_update_multi_param(2, progress_start_index, progress_start_val);
3009
3010 if (!ParallelVacuumIsActive(vacrel))
3011 {
3012 for (int idx = 0; idx < vacrel->nindexes; idx++)
3013 {
3014 Relation indrel = vacrel->indrels[idx];
3015 IndexBulkDeleteResult *istat = vacrel->indstats[idx];
3016
3017 vacrel->indstats[idx] =
3018 lazy_cleanup_one_index(indrel, istat, reltuples,
3019 estimated_count, vacrel);
3020
3021 /* Report the number of indexes cleaned up */
3023 idx + 1);
3024 }
3025 }
3026 else
3027 {
3028 /* Outsource everything to parallel variant */
3029 parallel_vacuum_cleanup_all_indexes(vacrel->pvs, reltuples,
3030 vacrel->num_index_scans,
3031 estimated_count);
3032 }
3033
3034 /* Reset the progress counters */
3035 pgstat_progress_update_multi_param(2, progress_end_index, progress_end_val);
3036}
3037
3038/*
3039 * lazy_vacuum_one_index() -- vacuum index relation.
3040 *
3041 * Delete all the index tuples containing a TID collected in
3042 * vacrel->dead_items. Also update running statistics. Exact
3043 * details depend on index AM's ambulkdelete routine.
3044 *
3045 * reltuples is the number of heap tuples to be passed to the
3046 * bulkdelete callback. It's always assumed to be estimated.
3047 * See indexam.sgml for more info.
3048 *
3049 * Returns bulk delete stats derived from input stats
3050 */
3051static IndexBulkDeleteResult *
3053 double reltuples, LVRelState *vacrel)
3054{
3055 IndexVacuumInfo ivinfo;
3056 LVSavedErrInfo saved_err_info;
3057
3058 ivinfo.index = indrel;
3059 ivinfo.heaprel = vacrel->rel;
3060 ivinfo.analyze_only = false;
3061 ivinfo.report_progress = false;
3062 ivinfo.estimated_count = true;
3063 ivinfo.message_level = DEBUG2;
3064 ivinfo.num_heap_tuples = reltuples;
3065 ivinfo.strategy = vacrel->bstrategy;
3066
3067 /*
3068 * Update error traceback information.
3069 *
3070 * The index name is saved during this phase and restored immediately
3071 * after this phase. See vacuum_error_callback.
3072 */
3073 Assert(vacrel->indname == NULL);
3074 vacrel->indname = pstrdup(RelationGetRelationName(indrel));
3075 update_vacuum_error_info(vacrel, &saved_err_info,
3078
3079 /* Do bulk deletion */
3080 istat = vac_bulkdel_one_index(&ivinfo, istat, vacrel->dead_items,
3081 vacrel->dead_items_info);
3082
3083 /* Revert to the previous phase information for error traceback */
3084 restore_vacuum_error_info(vacrel, &saved_err_info);
3085 pfree(vacrel->indname);
3086 vacrel->indname = NULL;
3087
3088 return istat;
3089}
3090
3091/*
3092 * lazy_cleanup_one_index() -- do post-vacuum cleanup for index relation.
3093 *
3094 * Calls index AM's amvacuumcleanup routine. reltuples is the number
3095 * of heap tuples and estimated_count is true if reltuples is an
3096 * estimated value. See indexam.sgml for more info.
3097 *
3098 * Returns bulk delete stats derived from input stats
3099 */
3100static IndexBulkDeleteResult *
3102 double reltuples, bool estimated_count,
3103 LVRelState *vacrel)
3104{
3105 IndexVacuumInfo ivinfo;
3106 LVSavedErrInfo saved_err_info;
3107
3108 ivinfo.index = indrel;
3109 ivinfo.heaprel = vacrel->rel;
3110 ivinfo.analyze_only = false;
3111 ivinfo.report_progress = false;
3112 ivinfo.estimated_count = estimated_count;
3113 ivinfo.message_level = DEBUG2;
3114
3115 ivinfo.num_heap_tuples = reltuples;
3116 ivinfo.strategy = vacrel->bstrategy;
3117
3118 /*
3119 * Update error traceback information.
3120 *
3121 * The index name is saved during this phase and restored immediately
3122 * after this phase. See vacuum_error_callback.
3123 */
3124 Assert(vacrel->indname == NULL);
3125 vacrel->indname = pstrdup(RelationGetRelationName(indrel));
3126 update_vacuum_error_info(vacrel, &saved_err_info,
3129
3130 istat = vac_cleanup_one_index(&ivinfo, istat);
3131
3132 /* Revert to the previous phase information for error traceback */
3133 restore_vacuum_error_info(vacrel, &saved_err_info);
3134 pfree(vacrel->indname);
3135 vacrel->indname = NULL;
3136
3137 return istat;
3138}
3139
3140/*
3141 * should_attempt_truncation - should we attempt to truncate the heap?
3142 *
3143 * Don't even think about it unless we have a shot at releasing a goodly
3144 * number of pages. Otherwise, the time taken isn't worth it, mainly because
3145 * an AccessExclusive lock must be replayed on any hot standby, where it can
3146 * be particularly disruptive.
3147 *
3148 * Also don't attempt it if wraparound failsafe is in effect. The entire
3149 * system might be refusing to allocate new XIDs at this point. The system
3150 * definitely won't return to normal unless and until VACUUM actually advances
3151 * the oldest relfrozenxid -- which hasn't happened for target rel just yet.
3152 * If lazy_truncate_heap attempted to acquire an AccessExclusiveLock to
3153 * truncate the table under these circumstances, an XID exhaustion error might
3154 * make it impossible for VACUUM to fix the underlying XID exhaustion problem.
3155 * There is very little chance of truncation working out when the failsafe is
3156 * in effect in any case. lazy_scan_prune makes the optimistic assumption
3157 * that any LP_DEAD items it encounters will always be LP_UNUSED by the time
3158 * we're called.
3159 */
3160static bool
3162{
3163 BlockNumber possibly_freeable;
3164
3165 if (!vacrel->do_rel_truncate || VacuumFailsafeActive)
3166 return false;
3167
3168 possibly_freeable = vacrel->rel_pages - vacrel->nonempty_pages;
3169 if (possibly_freeable > 0 &&
3170 (possibly_freeable >= REL_TRUNCATE_MINIMUM ||
3171 possibly_freeable >= vacrel->rel_pages / REL_TRUNCATE_FRACTION))
3172 return true;
3173
3174 return false;
3175}
3176
3177/*
3178 * lazy_truncate_heap - try to truncate off any empty pages at the end
3179 */
3180static void
3182{
3183 BlockNumber orig_rel_pages = vacrel->rel_pages;
3184 BlockNumber new_rel_pages;
3185 bool lock_waiter_detected;
3186 int lock_retry;
3187
3188 /* Report that we are now truncating */
3191
3192 /* Update error traceback information one last time */
3195
3196 /*
3197 * Loop until no more truncating can be done.
3198 */
3199 do
3200 {
3201 /*
3202 * We need full exclusive lock on the relation in order to do
3203 * truncation. If we can't get it, give up rather than waiting --- we
3204 * don't want to block other backends, and we don't want to deadlock
3205 * (which is quite possible considering we already hold a lower-grade
3206 * lock).
3207 */
3208 lock_waiter_detected = false;
3209 lock_retry = 0;
3210 while (true)
3211 {
3213 break;
3214
3215 /*
3216 * Check for interrupts while trying to (re-)acquire the exclusive
3217 * lock.
3218 */
3220
3221 if (++lock_retry > (VACUUM_TRUNCATE_LOCK_TIMEOUT /
3223 {
3224 /*
3225 * We failed to establish the lock in the specified number of
3226 * retries. This means we give up truncating.
3227 */
3228 ereport(vacrel->verbose ? INFO : DEBUG2,
3229 (errmsg("\"%s\": stopping truncate due to conflicting lock request",
3230 vacrel->relname)));
3231 return;
3232 }
3233
3234 (void) WaitLatch(MyLatch,
3237 WAIT_EVENT_VACUUM_TRUNCATE);
3239 }
3240
3241 /*
3242 * Now that we have exclusive lock, look to see if the rel has grown
3243 * whilst we were vacuuming with non-exclusive lock. If so, give up;
3244 * the newly added pages presumably contain non-deletable tuples.
3245 */
3246 new_rel_pages = RelationGetNumberOfBlocks(vacrel->rel);
3247 if (new_rel_pages != orig_rel_pages)
3248 {
3249 /*
3250 * Note: we intentionally don't update vacrel->rel_pages with the
3251 * new rel size here. If we did, it would amount to assuming that
3252 * the new pages are empty, which is unlikely. Leaving the numbers
3253 * alone amounts to assuming that the new pages have the same
3254 * tuple density as existing ones, which is less unlikely.
3255 */
3257 return;
3258 }
3259
3260 /*
3261 * Scan backwards from the end to verify that the end pages actually
3262 * contain no tuples. This is *necessary*, not optional, because
3263 * other backends could have added tuples to these pages whilst we
3264 * were vacuuming.
3265 */
3266 new_rel_pages = count_nondeletable_pages(vacrel, &lock_waiter_detected);
3267 vacrel->blkno = new_rel_pages;
3268
3269 if (new_rel_pages >= orig_rel_pages)
3270 {
3271 /* can't do anything after all */
3273 return;
3274 }
3275
3276 /*
3277 * Okay to truncate.
3278 */
3279 RelationTruncate(vacrel->rel, new_rel_pages);
3280
3281 /*
3282 * We can release the exclusive lock as soon as we have truncated.
3283 * Other backends can't safely access the relation until they have
3284 * processed the smgr invalidation that smgrtruncate sent out ... but
3285 * that should happen as part of standard invalidation processing once
3286 * they acquire lock on the relation.
3287 */
3289
3290 /*
3291 * Update statistics. Here, it *is* correct to adjust rel_pages
3292 * without also touching reltuples, since the tuple count wasn't
3293 * changed by the truncation.
3294 */
3295 vacrel->removed_pages += orig_rel_pages - new_rel_pages;
3296 vacrel->rel_pages = new_rel_pages;
3297
3298 ereport(vacrel->verbose ? INFO : DEBUG2,
3299 (errmsg("table \"%s\": truncated %u to %u pages",
3300 vacrel->relname,
3301 orig_rel_pages, new_rel_pages)));
3302 orig_rel_pages = new_rel_pages;
3303 } while (new_rel_pages > vacrel->nonempty_pages && lock_waiter_detected);
3304}
3305
3306/*
3307 * Rescan end pages to verify that they are (still) empty of tuples.
3308 *
3309 * Returns number of nondeletable pages (last nonempty page + 1).
3310 */
3311static BlockNumber
3312count_nondeletable_pages(LVRelState *vacrel, bool *lock_waiter_detected)
3313{
3314 BlockNumber blkno;
3315 BlockNumber prefetchedUntil;
3316 instr_time starttime;
3317
3318 /* Initialize the starttime if we check for conflicting lock requests */
3319 INSTR_TIME_SET_CURRENT(starttime);
3320
3321 /*
3322 * Start checking blocks at what we believe relation end to be and move
3323 * backwards. (Strange coding of loop control is needed because blkno is
3324 * unsigned.) To make the scan faster, we prefetch a few blocks at a time
3325 * in forward direction, so that OS-level readahead can kick in.
3326 */
3327 blkno = vacrel->rel_pages;
3329 "prefetch size must be power of 2");
3330 prefetchedUntil = InvalidBlockNumber;
3331 while (blkno > vacrel->nonempty_pages)
3332 {
3333 Buffer buf;
3334 Page page;
3335 OffsetNumber offnum,
3336 maxoff;
3337 bool hastup;
3338
3339 /*
3340 * Check if another process requests a lock on our relation. We are
3341 * holding an AccessExclusiveLock here, so they will be waiting. We
3342 * only do this once per VACUUM_TRUNCATE_LOCK_CHECK_INTERVAL, and we
3343 * only check if that interval has elapsed once every 32 blocks to
3344 * keep the number of system calls and actual shared lock table
3345 * lookups to a minimum.
3346 */
3347 if ((blkno % 32) == 0)
3348 {
3349 instr_time currenttime;
3350 instr_time elapsed;
3351
3352 INSTR_TIME_SET_CURRENT(currenttime);
3353 elapsed = currenttime;
3354 INSTR_TIME_SUBTRACT(elapsed, starttime);
3355 if ((INSTR_TIME_GET_MICROSEC(elapsed) / 1000)
3357 {
3359 {
3360 ereport(vacrel->verbose ? INFO : DEBUG2,
3361 (errmsg("table \"%s\": suspending truncate due to conflicting lock request",
3362 vacrel->relname)));
3363
3364 *lock_waiter_detected = true;
3365 return blkno;
3366 }
3367 starttime = currenttime;
3368 }
3369 }
3370
3371 /*
3372 * We don't insert a vacuum delay point here, because we have an
3373 * exclusive lock on the table which we want to hold for as short a
3374 * time as possible. We still need to check for interrupts however.
3375 */
3377
3378 blkno--;
3379
3380 /* If we haven't prefetched this lot yet, do so now. */
3381 if (prefetchedUntil > blkno)
3382 {
3383 BlockNumber prefetchStart;
3384 BlockNumber pblkno;
3385
3386 prefetchStart = blkno & ~(PREFETCH_SIZE - 1);
3387 for (pblkno = prefetchStart; pblkno <= blkno; pblkno++)
3388 {
3389 PrefetchBuffer(vacrel->rel, MAIN_FORKNUM, pblkno);
3391 }
3392 prefetchedUntil = prefetchStart;
3393 }
3394
3396 vacrel->bstrategy);
3397
3398 /* In this phase we only need shared access to the buffer */
3400
3401 page = BufferGetPage(buf);
3402
3403 if (PageIsNew(page) || PageIsEmpty(page))
3404 {
3406 continue;
3407 }
3408
3409 hastup = false;
3410 maxoff = PageGetMaxOffsetNumber(page);
3411 for (offnum = FirstOffsetNumber;
3412 offnum <= maxoff;
3413 offnum = OffsetNumberNext(offnum))
3414 {
3415 ItemId itemid;
3416
3417 itemid = PageGetItemId(page, offnum);
3418
3419 /*
3420 * Note: any non-unused item should be taken as a reason to keep
3421 * this page. Even an LP_DEAD item makes truncation unsafe, since
3422 * we must not have cleaned out its index entries.
3423 */
3424 if (ItemIdIsUsed(itemid))
3425 {
3426 hastup = true;
3427 break; /* can stop scanning */
3428 }
3429 } /* scan along page */
3430
3432
3433 /* Done scanning if we found a tuple here */
3434 if (hastup)
3435 return blkno + 1;
3436 }
3437
3438 /*
3439 * If we fall out of the loop, all the previously-thought-to-be-empty
3440 * pages still are; we need not bother to look at the last known-nonempty
3441 * page.
3442 */
3443 return vacrel->nonempty_pages;
3444}
3445
3446/*
3447 * Allocate dead_items and dead_items_info (either using palloc, or in dynamic
3448 * shared memory). Sets both in vacrel for caller.
3449 *
3450 * Also handles parallel initialization as part of allocating dead_items in
3451 * DSM when required.
3452 */
3453static void
3454dead_items_alloc(LVRelState *vacrel, int nworkers)
3455{
3456 VacDeadItemsInfo *dead_items_info;
3457 int vac_work_mem = AmAutoVacuumWorkerProcess() &&
3458 autovacuum_work_mem != -1 ?
3460
3461 /*
3462 * Initialize state for a parallel vacuum. As of now, only one worker can
3463 * be used for an index, so we invoke parallelism only if there are at
3464 * least two indexes on a table.
3465 */
3466 if (nworkers >= 0 && vacrel->nindexes > 1 && vacrel->do_index_vacuuming)
3467 {
3468 /*
3469 * Since parallel workers cannot access data in temporary tables, we
3470 * can't perform parallel vacuum on them.
3471 */
3472 if (RelationUsesLocalBuffers(vacrel->rel))
3473 {
3474 /*
3475 * Give warning only if the user explicitly tries to perform a
3476 * parallel vacuum on the temporary table.
3477 */
3478 if (nworkers > 0)
3480 (errmsg("disabling parallel option of vacuum on \"%s\" --- cannot vacuum temporary tables in parallel",
3481 vacrel->relname)));
3482 }
3483 else
3484 vacrel->pvs = parallel_vacuum_init(vacrel->rel, vacrel->indrels,
3485 vacrel->nindexes, nworkers,
3486 vac_work_mem,
3487 vacrel->verbose ? INFO : DEBUG2,
3488 vacrel->bstrategy);
3489
3490 /*
3491 * If parallel mode started, dead_items and dead_items_info spaces are
3492 * allocated in DSM.
3493 */
3494 if (ParallelVacuumIsActive(vacrel))
3495 {
3497 &vacrel->dead_items_info);
3498 return;
3499 }
3500 }
3501
3502 /*
3503 * Serial VACUUM case. Allocate both dead_items and dead_items_info
3504 * locally.
3505 */
3506
3507 dead_items_info = (VacDeadItemsInfo *) palloc(sizeof(VacDeadItemsInfo));
3508 dead_items_info->max_bytes = vac_work_mem * (Size) 1024;
3509 dead_items_info->num_items = 0;
3510 vacrel->dead_items_info = dead_items_info;
3511
3512 vacrel->dead_items = TidStoreCreateLocal(dead_items_info->max_bytes, true);
3513}
3514
3515/*
3516 * Add the given block number and offset numbers to dead_items.
3517 */
3518static void
3520 int num_offsets)
3521{
3522 const int prog_index[2] = {
3525 };
3526 int64 prog_val[2];
3527
3528 TidStoreSetBlockOffsets(vacrel->dead_items, blkno, offsets, num_offsets);
3529 vacrel->dead_items_info->num_items += num_offsets;
3530
3531 /* update the progress information */
3532 prog_val[0] = vacrel->dead_items_info->num_items;
3533 prog_val[1] = TidStoreMemoryUsage(vacrel->dead_items);
3534 pgstat_progress_update_multi_param(2, prog_index, prog_val);
3535}
3536
3537/*
3538 * Forget all collected dead items.
3539 */
3540static void
3542{
3543 if (ParallelVacuumIsActive(vacrel))
3544 {
3546 return;
3547 }
3548
3549 /* Recreate the tidstore with the same max_bytes limitation */
3550 TidStoreDestroy(vacrel->dead_items);
3551 vacrel->dead_items = TidStoreCreateLocal(vacrel->dead_items_info->max_bytes, true);
3552
3553 /* Reset the counter */
3554 vacrel->dead_items_info->num_items = 0;
3555}
3556
3557/*
3558 * Perform cleanup for resources allocated in dead_items_alloc
3559 */
3560static void
3562{
3563 if (!ParallelVacuumIsActive(vacrel))
3564 {
3565 /* Don't bother with pfree here */
3566 return;
3567 }
3568
3569 /* End parallel mode */
3570 parallel_vacuum_end(vacrel->pvs, vacrel->indstats);
3571 vacrel->pvs = NULL;
3572}
3573
3574/*
3575 * Check if every tuple in the given page is visible to all current and future
3576 * transactions. Also return the visibility_cutoff_xid which is the highest
3577 * xmin amongst the visible tuples. Set *all_frozen to true if every tuple
3578 * on this page is frozen.
3579 *
3580 * This is a stripped down version of lazy_scan_prune(). If you change
3581 * anything here, make sure that everything stays in sync. Note that an
3582 * assertion calls us to verify that everybody still agrees. Be sure to avoid
3583 * introducing new side-effects here.
3584 */
3585static bool
3587 TransactionId *visibility_cutoff_xid,
3588 bool *all_frozen)
3589{
3590 Page page = BufferGetPage(buf);
3592 OffsetNumber offnum,
3593 maxoff;
3594 bool all_visible = true;
3595
3596 *visibility_cutoff_xid = InvalidTransactionId;
3597 *all_frozen = true;
3598
3599 maxoff = PageGetMaxOffsetNumber(page);
3600 for (offnum = FirstOffsetNumber;
3601 offnum <= maxoff && all_visible;
3602 offnum = OffsetNumberNext(offnum))
3603 {
3604 ItemId itemid;
3605 HeapTupleData tuple;
3606
3607 /*
3608 * Set the offset number so that we can display it along with any
3609 * error that occurred while processing this tuple.
3610 */
3611 vacrel->offnum = offnum;
3612 itemid = PageGetItemId(page, offnum);
3613
3614 /* Unused or redirect line pointers are of no interest */
3615 if (!ItemIdIsUsed(itemid) || ItemIdIsRedirected(itemid))
3616 continue;
3617
3618 ItemPointerSet(&(tuple.t_self), blockno, offnum);
3619
3620 /*
3621 * Dead line pointers can have index pointers pointing to them. So
3622 * they can't be treated as visible
3623 */
3624 if (ItemIdIsDead(itemid))
3625 {
3626 all_visible = false;
3627 *all_frozen = false;
3628 break;
3629 }
3630
3631 Assert(ItemIdIsNormal(itemid));
3632
3633 tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
3634 tuple.t_len = ItemIdGetLength(itemid);
3635 tuple.t_tableOid = RelationGetRelid(vacrel->rel);
3636
3637 switch (HeapTupleSatisfiesVacuum(&tuple, vacrel->cutoffs.OldestXmin,
3638 buf))
3639 {
3640 case HEAPTUPLE_LIVE:
3641 {
3642 TransactionId xmin;
3643
3644 /* Check comments in lazy_scan_prune. */
3646 {
3647 all_visible = false;
3648 *all_frozen = false;
3649 break;
3650 }
3651
3652 /*
3653 * The inserter definitely committed. But is it old enough
3654 * that everyone sees it as committed?
3655 */
3656 xmin = HeapTupleHeaderGetXmin(tuple.t_data);
3657 if (!TransactionIdPrecedes(xmin,
3658 vacrel->cutoffs.OldestXmin))
3659 {
3660 all_visible = false;
3661 *all_frozen = false;
3662 break;
3663 }
3664
3665 /* Track newest xmin on page. */
3666 if (TransactionIdFollows(xmin, *visibility_cutoff_xid) &&
3668 *visibility_cutoff_xid = xmin;
3669
3670 /* Check whether this tuple is already frozen or not */
3671 if (all_visible && *all_frozen &&
3673 *all_frozen = false;
3674 }
3675 break;
3676
3677 case HEAPTUPLE_DEAD:
3681 {
3682 all_visible = false;
3683 *all_frozen = false;
3684 break;
3685 }
3686 default:
3687 elog(ERROR, "unexpected HeapTupleSatisfiesVacuum result");
3688 break;
3689 }
3690 } /* scan along page */
3691
3692 /* Clear the offset information once we have processed the given page. */
3693 vacrel->offnum = InvalidOffsetNumber;
3694
3695 return all_visible;
3696}
3697
3698/*
3699 * Update index statistics in pg_class if the statistics are accurate.
3700 */
3701static void
3703{
3704 Relation *indrels = vacrel->indrels;
3705 int nindexes = vacrel->nindexes;
3706 IndexBulkDeleteResult **indstats = vacrel->indstats;
3707
3708 Assert(vacrel->do_index_cleanup);
3709
3710 for (int idx = 0; idx < nindexes; idx++)
3711 {
3712 Relation indrel = indrels[idx];
3713 IndexBulkDeleteResult *istat = indstats[idx];
3714
3715 if (istat == NULL || istat->estimated_count)
3716 continue;
3717
3718 /* Update index statistics */
3719 vac_update_relstats(indrel,
3720 istat->num_pages,
3721 istat->num_index_tuples,
3722 0,
3723 false,
3726 NULL, NULL, false);
3727 }
3728}
3729
3730/*
3731 * Error context callback for errors occurring during vacuum. The error
3732 * context messages for index phases should match the messages set in parallel
3733 * vacuum. If you change this function for those phases, change
3734 * parallel_vacuum_error_callback() as well.
3735 */
3736static void
3738{
3739 LVRelState *errinfo = arg;
3740
3741 switch (errinfo->phase)
3742 {
3744 if (BlockNumberIsValid(errinfo->blkno))
3745 {
3746 if (OffsetNumberIsValid(errinfo->offnum))
3747 errcontext("while scanning block %u offset %u of relation \"%s.%s\"",
3748 errinfo->blkno, errinfo->offnum, errinfo->relnamespace, errinfo->relname);
3749 else
3750 errcontext("while scanning block %u of relation \"%s.%s\"",
3751 errinfo->blkno, errinfo->relnamespace, errinfo->relname);
3752 }
3753 else
3754 errcontext("while scanning relation \"%s.%s\"",
3755 errinfo->relnamespace, errinfo->relname);
3756 break;
3757
3759 if (BlockNumberIsValid(errinfo->blkno))
3760 {
3761 if (OffsetNumberIsValid(errinfo->offnum))
3762 errcontext("while vacuuming block %u offset %u of relation \"%s.%s\"",
3763 errinfo->blkno, errinfo->offnum, errinfo->relnamespace, errinfo->relname);
3764 else
3765 errcontext("while vacuuming block %u of relation \"%s.%s\"",
3766 errinfo->blkno, errinfo->relnamespace, errinfo->relname);
3767 }
3768 else
3769 errcontext("while vacuuming relation \"%s.%s\"",
3770 errinfo->relnamespace, errinfo->relname);
3771 break;
3772
3774 errcontext("while vacuuming index \"%s\" of relation \"%s.%s\"",
3775 errinfo->indname, errinfo->relnamespace, errinfo->relname);
3776 break;
3777
3779 errcontext("while cleaning up index \"%s\" of relation \"%s.%s\"",
3780 errinfo->indname, errinfo->relnamespace, errinfo->relname);
3781 break;
3782
3784 if (BlockNumberIsValid(errinfo->blkno))
3785 errcontext("while truncating relation \"%s.%s\" to %u blocks",
3786 errinfo->relnamespace, errinfo->relname, errinfo->blkno);
3787 break;
3788
3790 default:
3791 return; /* do nothing; the errinfo may not be
3792 * initialized */
3793 }
3794}
3795
3796/*
3797 * Updates the information required for vacuum error callback. This also saves
3798 * the current information which can be later restored via restore_vacuum_error_info.
3799 */
3800static void
3802 int phase, BlockNumber blkno, OffsetNumber offnum)
3803{
3804 if (saved_vacrel)
3805 {
3806 saved_vacrel->offnum = vacrel->offnum;
3807 saved_vacrel->blkno = vacrel->blkno;
3808 saved_vacrel->phase = vacrel->phase;
3809 }
3810
3811 vacrel->blkno = blkno;
3812 vacrel->offnum = offnum;
3813 vacrel->phase = phase;
3814}
3815
3816/*
3817 * Restores the vacuum information saved via a prior call to update_vacuum_error_info.
3818 */
3819static void
3821 const LVSavedErrInfo *saved_vacrel)
3822{
3823 vacrel->blkno = saved_vacrel->blkno;
3824 vacrel->offnum = saved_vacrel->offnum;
3825 vacrel->phase = saved_vacrel->phase;
3826}
Datum idx(PG_FUNCTION_ARGS)
Definition: _int_op.c:259
int autovacuum_work_mem
Definition: autovacuum.c:120
void TimestampDifference(TimestampTz start_time, TimestampTz stop_time, long *secs, int *microsecs)
Definition: timestamp.c:1720
bool TimestampDifferenceExceeds(TimestampTz start_time, TimestampTz stop_time, int msec)
Definition: timestamp.c:1780
TimestampTz GetCurrentTimestamp(void)
Definition: timestamp.c:1644
void pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
void pgstat_progress_update_param(int index, int64 val)
void pgstat_progress_update_multi_param(int nparam, const int *index, const int64 *val)
void pgstat_progress_end_command(void)
@ PROGRESS_COMMAND_VACUUM
PgBackendStatus * MyBEEntry
uint32 BlockNumber
Definition: block.h:31
#define InvalidBlockNumber
Definition: block.h:33
static bool BlockNumberIsValid(BlockNumber blockNumber)
Definition: block.h:71
int Buffer
Definition: buf.h:23
#define InvalidBuffer
Definition: buf.h:25
bool track_io_timing
Definition: bufmgr.c:143
void CheckBufferIsPinnedOnce(Buffer buffer)
Definition: bufmgr.c:5144
BlockNumber BufferGetBlockNumber(Buffer buffer)
Definition: bufmgr.c:3721
PrefetchBufferResult PrefetchBuffer(Relation reln, ForkNumber forkNum, BlockNumber blockNum)
Definition: bufmgr.c:639
void ReleaseBuffer(Buffer buffer)
Definition: bufmgr.c:4863
void UnlockReleaseBuffer(Buffer buffer)
Definition: bufmgr.c:4880
void MarkBufferDirty(Buffer buffer)
Definition: bufmgr.c:2529
void LockBufferForCleanup(Buffer buffer)
Definition: bufmgr.c:5177
void LockBuffer(Buffer buffer, int mode)
Definition: bufmgr.c:5097
Buffer ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum, ReadBufferMode mode, BufferAccessStrategy strategy)
Definition: bufmgr.c:793
bool ConditionalLockBufferForCleanup(Buffer buffer)
Definition: bufmgr.c:5338
#define BUFFER_LOCK_UNLOCK
Definition: bufmgr.h:189
#define BUFFER_LOCK_SHARE
Definition: bufmgr.h:190
#define RelationGetNumberOfBlocks(reln)
Definition: bufmgr.h:273
static Page BufferGetPage(Buffer buffer)
Definition: bufmgr.h:396
#define BUFFER_LOCK_EXCLUSIVE
Definition: bufmgr.h:191
@ RBM_NORMAL
Definition: bufmgr.h:45
static bool BufferIsValid(Buffer bufnum)
Definition: bufmgr.h:347
Size PageGetHeapFreeSpace(const PageData *page)
Definition: bufpage.c:980
void PageTruncateLinePointerArray(Page page)
Definition: bufpage.c:824
static bool PageIsEmpty(const PageData *page)
Definition: bufpage.h:224
static bool PageIsAllVisible(const PageData *page)
Definition: bufpage.h:429
static void PageClearAllVisible(Page page)
Definition: bufpage.h:439
static Item PageGetItem(const PageData *page, const ItemIdData *itemId)
Definition: bufpage.h:354
static bool PageIsNew(const PageData *page)
Definition: bufpage.h:234
#define SizeOfPageHeaderData
Definition: bufpage.h:217
static void PageSetAllVisible(Page page)
Definition: bufpage.h:434
static ItemId PageGetItemId(Page page, OffsetNumber offsetNumber)
Definition: bufpage.h:244
PageData * Page
Definition: bufpage.h:82
static XLogRecPtr PageGetLSN(const PageData *page)
Definition: bufpage.h:386
static OffsetNumber PageGetMaxOffsetNumber(const PageData *page)
Definition: bufpage.h:372
uint8_t uint8
Definition: c.h:486
#define Max(x, y)
Definition: c.h:955
#define Assert(condition)
Definition: c.h:815
int64_t int64
Definition: c.h:485
TransactionId MultiXactId
Definition: c.h:619
int32_t int32
Definition: c.h:484
#define unlikely(x)
Definition: c.h:333
uint32_t uint32
Definition: c.h:488
#define lengthof(array)
Definition: c.h:745
#define StaticAssertStmt(condition, errmessage)
Definition: c.h:895
uint32 TransactionId
Definition: c.h:609
size_t Size
Definition: c.h:562
int64 TimestampTz
Definition: timestamp.h:39
char * get_database_name(Oid dbid)
Definition: dbcommands.c:3187
int errmsg_internal(const char *fmt,...)
Definition: elog.c:1157
int errdetail(const char *fmt,...)
Definition: elog.c:1203
ErrorContextCallback * error_context_stack
Definition: elog.c:94
int errhint(const char *fmt,...)
Definition: elog.c:1317
int errmsg(const char *fmt,...)
Definition: elog.c:1070
#define _(x)
Definition: elog.c:90
#define LOG
Definition: elog.h:31
#define errcontext
Definition: elog.h:196
#define WARNING
Definition: elog.h:36
#define DEBUG2
Definition: elog.h:29
#define ERROR
Definition: elog.h:39
#define elog(elevel,...)
Definition: elog.h:225
#define INFO
Definition: elog.h:34
#define ereport(elevel,...)
Definition: elog.h:149
void FreeSpaceMapVacuumRange(Relation rel, BlockNumber start, BlockNumber end)
Definition: freespace.c:377
Size GetRecordedFreeSpace(Relation rel, BlockNumber heapBlk)
Definition: freespace.c:244
void RecordPageWithFreeSpace(Relation rel, BlockNumber heapBlk, Size spaceAvail)
Definition: freespace.c:194
bool VacuumCostActive
Definition: globals.c:157
int VacuumCostBalance
Definition: globals.c:156
int maintenance_work_mem
Definition: globals.c:132
struct Latch * MyLatch
Definition: globals.c:62
Oid MyDatabaseId
Definition: globals.c:93
bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple)
Definition: heapam.c:7716
bool heap_tuple_should_freeze(HeapTupleHeader tuple, const struct VacuumCutoffs *cutoffs, TransactionId *NoFreezePageRelfrozenXid, MultiXactId *NoFreezePageRelminMxid)
Definition: heapam.c:7771
#define HEAP_PAGE_PRUNE_FREEZE
Definition: heapam.h:43
@ HEAPTUPLE_RECENTLY_DEAD
Definition: heapam.h:136
@ HEAPTUPLE_INSERT_IN_PROGRESS
Definition: heapam.h:137
@ HEAPTUPLE_LIVE
Definition: heapam.h:135
@ HEAPTUPLE_DELETE_IN_PROGRESS
Definition: heapam.h:138
@ HEAPTUPLE_DEAD
Definition: heapam.h:134
@ PRUNE_VACUUM_CLEANUP
Definition: heapam.h:280
@ PRUNE_VACUUM_SCAN
Definition: heapam.h:279
#define HEAP_PAGE_PRUNE_MARK_UNUSED_NOW
Definition: heapam.h:42
HTSV_Result HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin, Buffer buffer)
HeapTupleHeaderData * HeapTupleHeader
Definition: htup.h:23
static TransactionId HeapTupleHeaderGetXmin(const HeapTupleHeaderData *tup)
Definition: htup_details.h:324
#define MaxHeapTuplesPerPage
Definition: htup_details.h:624
static bool HeapTupleHeaderXminCommitted(const HeapTupleHeaderData *tup)
Definition: htup_details.h:337
int verbose
#define INSTR_TIME_SET_CURRENT(t)
Definition: instr_time.h:122
#define INSTR_TIME_SUBTRACT(x, y)
Definition: instr_time.h:181
#define INSTR_TIME_GET_MICROSEC(t)
Definition: instr_time.h:194
WalUsage pgWalUsage
Definition: instrument.c:22
void WalUsageAccumDiff(WalUsage *dst, const WalUsage *add, const WalUsage *sub)
Definition: instrument.c:286
BufferUsage pgBufferUsage
Definition: instrument.c:20
void BufferUsageAccumDiff(BufferUsage *dst, const BufferUsage *add, const BufferUsage *sub)
Definition: instrument.c:248
static int pg_cmp_u16(uint16 a, uint16 b)
Definition: int.h:640
int b
Definition: isn.c:69
int a
Definition: isn.c:68
int i
Definition: isn.c:72
#define ItemIdGetLength(itemId)
Definition: itemid.h:59
#define ItemIdIsNormal(itemId)
Definition: itemid.h:99
#define ItemIdIsDead(itemId)
Definition: itemid.h:113
#define ItemIdIsUsed(itemId)
Definition: itemid.h:92
#define ItemIdSetUnused(itemId)
Definition: itemid.h:128
#define ItemIdIsRedirected(itemId)
Definition: itemid.h:106
#define ItemIdHasStorage(itemId)
Definition: itemid.h:120
static void ItemPointerSet(ItemPointerData *pointer, BlockNumber blockNumber, OffsetNumber offNum)
Definition: itemptr.h:135
void ResetLatch(Latch *latch)
Definition: latch.c:724
int WaitLatch(Latch *latch, int wakeEvents, long timeout, uint32 wait_event_info)
Definition: latch.c:517
#define WL_TIMEOUT
Definition: latch.h:130
#define WL_EXIT_ON_PM_DEATH
Definition: latch.h:132
#define WL_LATCH_SET
Definition: latch.h:127
void UnlockRelation(Relation relation, LOCKMODE lockmode)
Definition: lmgr.c:309
bool ConditionalLockRelation(Relation relation, LOCKMODE lockmode)
Definition: lmgr.c:274
bool LockHasWaitersRelation(Relation relation, LOCKMODE lockmode)
Definition: lmgr.c:362
#define NoLock
Definition: lockdefs.h:34
#define AccessExclusiveLock
Definition: lockdefs.h:43
#define RowExclusiveLock
Definition: lockdefs.h:38
char * get_namespace_name(Oid nspid)
Definition: lsyscache.c:3393
char * pstrdup(const char *in)
Definition: mcxt.c:1696
void pfree(void *pointer)
Definition: mcxt.c:1521
void * palloc0(Size size)
Definition: mcxt.c:1347
void * palloc(Size size)
Definition: mcxt.c:1317
#define AmAutoVacuumWorkerProcess()
Definition: miscadmin.h:381
#define START_CRIT_SECTION()
Definition: miscadmin.h:149
#define CHECK_FOR_INTERRUPTS()
Definition: miscadmin.h:122
#define END_CRIT_SECTION()
Definition: miscadmin.h:151
bool MultiXactIdPrecedes(MultiXactId multi1, MultiXactId multi2)
Definition: multixact.c:3317
bool MultiXactIdPrecedesOrEquals(MultiXactId multi1, MultiXactId multi2)
Definition: multixact.c:3331
#define MultiXactIdIsValid(multi)
Definition: multixact.h:28
#define InvalidMultiXactId
Definition: multixact.h:24
#define InvalidOffsetNumber
Definition: off.h:26
#define OffsetNumberIsValid(offsetNumber)
Definition: off.h:39
#define OffsetNumberNext(offsetNumber)
Definition: off.h:52
uint16 OffsetNumber
Definition: off.h:24
#define FirstOffsetNumber
Definition: off.h:27
#define MaxOffsetNumber
Definition: off.h:28
void * arg
uint32 pg_prng_uint32(pg_prng_state *state)
Definition: pg_prng.c:227
pg_prng_state pg_global_prng_state
Definition: pg_prng.c:34
const char * pg_rusage_show(const PGRUsage *ru0)
Definition: pg_rusage.c:40
void pg_rusage_init(PGRUsage *ru0)
Definition: pg_rusage.c:27
static char * buf
Definition: pg_test_fsync.c:72
int64 PgStat_Counter
Definition: pgstat.h:66
PgStat_Counter pgStatBlockReadTime
PgStat_Counter pgStatBlockWriteTime
void pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter livetuples, PgStat_Counter deadtuples, TimestampTz starttime)
#define qsort(a, b, c, d)
Definition: port.h:475
GlobalVisState * GlobalVisTestFor(Relation rel)
Definition: procarray.c:4107
#define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP
Definition: progress.h:39
#define PROGRESS_VACUUM_DEAD_TUPLE_BYTES
Definition: progress.h:27
#define PROGRESS_VACUUM_PHASE_SCAN_HEAP
Definition: progress.h:34
#define PROGRESS_VACUUM_TOTAL_HEAP_BLKS
Definition: progress.h:22
#define PROGRESS_VACUUM_PHASE
Definition: progress.h:21
#define PROGRESS_VACUUM_DELAY_TIME
Definition: progress.h:31
#define PROGRESS_VACUUM_NUM_INDEX_VACUUMS
Definition: progress.h:25
#define PROGRESS_VACUUM_PHASE_VACUUM_HEAP
Definition: progress.h:36
#define PROGRESS_VACUUM_NUM_DEAD_ITEM_IDS
Definition: progress.h:28
#define PROGRESS_VACUUM_MAX_DEAD_TUPLE_BYTES
Definition: progress.h:26
#define PROGRESS_VACUUM_HEAP_BLKS_SCANNED
Definition: progress.h:23
#define PROGRESS_VACUUM_PHASE_INDEX_CLEANUP
Definition: progress.h:37
#define PROGRESS_VACUUM_PHASE_VACUUM_INDEX
Definition: progress.h:35
#define PROGRESS_VACUUM_INDEXES_PROCESSED
Definition: progress.h:30
#define PROGRESS_VACUUM_INDEXES_TOTAL
Definition: progress.h:29
#define PROGRESS_VACUUM_HEAP_BLKS_VACUUMED
Definition: progress.h:24
#define PROGRESS_VACUUM_PHASE_TRUNCATE
Definition: progress.h:38
void heap_page_prune_and_freeze(Relation relation, Buffer buffer, GlobalVisState *vistest, int options, struct VacuumCutoffs *cutoffs, PruneFreezeResult *presult, PruneReason reason, OffsetNumber *off_loc, TransactionId *new_relfrozen_xid, MultiXactId *new_relmin_mxid)
Definition: pruneheap.c:350
void log_heap_prune_and_freeze(Relation relation, Buffer buffer, TransactionId conflict_xid, bool cleanup_lock, PruneReason reason, HeapTupleFreeze *frozen, int nfrozen, OffsetNumber *redirected, int nredirected, OffsetNumber *dead, int ndead, OffsetNumber *unused, int nunused)
Definition: pruneheap.c:2053
Buffer read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
Definition: read_stream.c:616
ReadStream * read_stream_begin_relation(int flags, BufferAccessStrategy strategy, Relation rel, ForkNumber forknum, ReadStreamBlockNumberCB callback, void *callback_private_data, size_t per_buffer_data_size)
Definition: read_stream.c:562
void read_stream_end(ReadStream *stream)
Definition: read_stream.c:872
#define READ_STREAM_MAINTENANCE
Definition: read_stream.h:28
#define RelationGetRelid(relation)
Definition: rel.h:512
#define RelationGetRelationName(relation)
Definition: rel.h:546
#define RelationNeedsWAL(relation)
Definition: rel.h:635
#define RelationUsesLocalBuffers(relation)
Definition: rel.h:644
#define RelationGetNamespace(relation)
Definition: rel.h:553
@ MAIN_FORKNUM
Definition: relpath.h:58
void RelationTruncate(Relation rel, BlockNumber nblocks)
Definition: storage.c:288
void appendStringInfo(StringInfo str, const char *fmt,...)
Definition: stringinfo.c:145
void appendStringInfoString(StringInfo str, const char *s)
Definition: stringinfo.c:230
void initStringInfo(StringInfo str)
Definition: stringinfo.c:97
int64 shared_blks_dirtied
Definition: instrument.h:28
int64 local_blks_hit
Definition: instrument.h:30
int64 shared_blks_read
Definition: instrument.h:27
int64 local_blks_read
Definition: instrument.h:31
int64 local_blks_dirtied
Definition: instrument.h:32
int64 shared_blks_hit
Definition: instrument.h:26
struct ErrorContextCallback * previous
Definition: elog.h:296
void(* callback)(void *arg)
Definition: elog.h:297
ItemPointerData t_self
Definition: htup.h:65
uint32 t_len
Definition: htup.h:64
HeapTupleHeader t_data
Definition: htup.h:68
Oid t_tableOid
Definition: htup.h:66
bool estimated_count
Definition: genam.h:80
BlockNumber pages_deleted
Definition: genam.h:84
BlockNumber pages_newly_deleted
Definition: genam.h:83
BlockNumber pages_free
Definition: genam.h:85
BlockNumber num_pages
Definition: genam.h:79
double num_index_tuples
Definition: genam.h:81
Relation index
Definition: genam.h:48
double num_heap_tuples
Definition: genam.h:54
bool analyze_only
Definition: genam.h:50
BufferAccessStrategy strategy
Definition: genam.h:55
Relation heaprel
Definition: genam.h:49
bool report_progress
Definition: genam.h:51
int message_level
Definition: genam.h:53
bool estimated_count
Definition: genam.h:52
BlockNumber next_eager_scan_region_start
Definition: vacuumlazy.c:378
ParallelVacuumState * pvs
Definition: vacuumlazy.c:268
bool next_unskippable_eager_scanned
Definition: vacuumlazy.c:363
bool verbose
Definition: vacuumlazy.c:298
VacDeadItemsInfo * dead_items_info
Definition: vacuumlazy.c:311
BlockNumber vm_new_frozen_pages
Definition: vacuumlazy.c:337
int nindexes
Definition: vacuumlazy.c:264
Buffer next_unskippable_vmbuffer
Definition: vacuumlazy.c:364
OffsetNumber offnum
Definition: vacuumlazy.c:296
TidStore * dead_items
Definition: vacuumlazy.c:310
int64 tuples_deleted
Definition: vacuumlazy.c:352
BlockNumber nonempty_pages
Definition: vacuumlazy.c:341
BlockNumber eager_scan_remaining_fails
Definition: vacuumlazy.c:410
bool do_rel_truncate
Definition: vacuumlazy.c:280
BlockNumber scanned_pages
Definition: vacuumlazy.c:314
bool aggressive
Definition: vacuumlazy.c:271
BlockNumber new_frozen_tuple_pages
Definition: vacuumlazy.c:323
GlobalVisState * vistest
Definition: vacuumlazy.c:284
BlockNumber removed_pages
Definition: vacuumlazy.c:322
int num_index_scans
Definition: vacuumlazy.c:350
IndexBulkDeleteResult ** indstats
Definition: vacuumlazy.c:347
double new_live_tuples
Definition: vacuumlazy.c:345
double new_rel_tuples
Definition: vacuumlazy.c:344
TransactionId NewRelfrozenXid
Definition: vacuumlazy.c:286
Relation rel
Definition: vacuumlazy.c:262
bool consider_bypass_optimization
Definition: vacuumlazy.c:275
BlockNumber rel_pages
Definition: vacuumlazy.c:313
BlockNumber next_unskippable_block
Definition: vacuumlazy.c:361
int64 recently_dead_tuples
Definition: vacuumlazy.c:356
int64 tuples_frozen
Definition: vacuumlazy.c:353
char * dbname
Definition: vacuumlazy.c:291
BlockNumber missed_dead_pages
Definition: vacuumlazy.c:340
BlockNumber current_block
Definition: vacuumlazy.c:360
char * relnamespace
Definition: vacuumlazy.c:292
int64 live_tuples
Definition: vacuumlazy.c:355
int64 lpdead_items
Definition: vacuumlazy.c:354
BufferAccessStrategy bstrategy
Definition: vacuumlazy.c:267
BlockNumber eager_scan_remaining_successes
Definition: vacuumlazy.c:389
bool skippedallvis
Definition: vacuumlazy.c:288
BlockNumber lpdead_item_pages
Definition: vacuumlazy.c:339
BlockNumber eager_scanned_pages
Definition: vacuumlazy.c:320
Relation * indrels
Definition: vacuumlazy.c:263
bool skipwithvm
Definition: vacuumlazy.c:273
bool do_index_cleanup
Definition: vacuumlazy.c:279
MultiXactId NewRelminMxid
Definition: vacuumlazy.c:287
int64 missed_dead_tuples
Definition: vacuumlazy.c:357
BlockNumber blkno
Definition: vacuumlazy.c:295
struct VacuumCutoffs cutoffs
Definition: vacuumlazy.c:283
bool next_unskippable_allvis
Definition: vacuumlazy.c:362
BlockNumber vm_new_visible_pages
Definition: vacuumlazy.c:326
char * relname
Definition: vacuumlazy.c:293
BlockNumber eager_scan_max_fails_per_region
Definition: vacuumlazy.c:400
VacErrPhase phase
Definition: vacuumlazy.c:297
char * indname
Definition: vacuumlazy.c:294
BlockNumber vm_new_visible_frozen_pages
Definition: vacuumlazy.c:334
bool do_index_vacuuming
Definition: vacuumlazy.c:278
BlockNumber blkno
Definition: vacuumlazy.c:417
VacErrPhase phase
Definition: vacuumlazy.c:419
OffsetNumber offnum
Definition: vacuumlazy.c:418
int64 st_progress_param[PGSTAT_NUM_PROGRESS_PARAM]
int recently_dead_tuples
Definition: heapam.h:243
TransactionId vm_conflict_horizon
Definition: heapam.h:258
OffsetNumber deadoffsets[MaxHeapTuplesPerPage]
Definition: heapam.h:272
bool all_visible
Definition: heapam.h:256
Form_pg_class rd_rel
Definition: rel.h:111
BlockNumber blkno
Definition: tidstore.h:29
size_t max_bytes
Definition: vacuum.h:294
int64 num_items
Definition: vacuum.h:295
TransactionId FreezeLimit
Definition: vacuum.h:284
TransactionId OldestXmin
Definition: vacuum.h:274
TransactionId relfrozenxid
Definition: vacuum.h:258
MultiXactId relminmxid
Definition: vacuum.h:259
MultiXactId MultiXactCutoff
Definition: vacuum.h:285
MultiXactId OldestMxact
Definition: vacuum.h:275
int nworkers
Definition: vacuum.h:246
VacOptValue truncate
Definition: vacuum.h:231
bits32 options
Definition: vacuum.h:219
bool is_wraparound
Definition: vacuum.h:226
int log_min_duration
Definition: vacuum.h:227
VacOptValue index_cleanup
Definition: vacuum.h:230
double max_eager_freeze_failure_rate
Definition: vacuum.h:239
uint64 wal_bytes
Definition: instrument.h:55
int64 wal_fpi
Definition: instrument.h:54
int64 wal_records
Definition: instrument.h:53
TidStoreIter * TidStoreBeginIterate(TidStore *ts)
Definition: tidstore.c:471
void TidStoreEndIterate(TidStoreIter *iter)
Definition: tidstore.c:518
TidStoreIterResult * TidStoreIterateNext(TidStoreIter *iter)
Definition: tidstore.c:493
TidStore * TidStoreCreateLocal(size_t max_bytes, bool insert_only)
Definition: tidstore.c:162
void TidStoreDestroy(TidStore *ts)
Definition: tidstore.c:317
int TidStoreGetBlockOffsets(TidStoreIterResult *result, OffsetNumber *offsets, int max_offsets)
Definition: tidstore.c:566
void TidStoreSetBlockOffsets(TidStore *ts, BlockNumber blkno, OffsetNumber *offsets, int num_offsets)
Definition: tidstore.c:345
size_t TidStoreMemoryUsage(TidStore *ts)
Definition: tidstore.c:532
bool TransactionIdPrecedes(TransactionId id1, TransactionId id2)
Definition: transam.c:280
bool TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2)
Definition: transam.c:299
bool TransactionIdFollows(TransactionId id1, TransactionId id2)
Definition: transam.c:314
static TransactionId ReadNextTransactionId(void)
Definition: transam.h:315
#define InvalidTransactionId
Definition: transam.h:31
#define TransactionIdIsValid(xid)
Definition: transam.h:41
#define TransactionIdIsNormal(xid)
Definition: transam.h:42
bool track_cost_delay_timing
Definition: vacuum.c:80
void vac_open_indexes(Relation relation, LOCKMODE lockmode, int *nindexes, Relation **Irel)
Definition: vacuum.c:2323
void vac_update_relstats(Relation relation, BlockNumber num_pages, double num_tuples, BlockNumber num_all_visible_pages, bool hasindex, TransactionId frozenxid, MultiXactId minmulti, bool *frozenxid_updated, bool *minmulti_updated, bool in_outer_xact)
Definition: vacuum.c:1427
IndexBulkDeleteResult * vac_cleanup_one_index(IndexVacuumInfo *ivinfo, IndexBulkDeleteResult *istat)
Definition: vacuum.c:2615
void vac_close_indexes(int nindexes, Relation *Irel, LOCKMODE lockmode)
Definition: vacuum.c:2366
void vacuum_delay_point(bool is_analyze)
Definition: vacuum.c:2387
bool vacuum_get_cutoffs(Relation rel, const VacuumParams *params, struct VacuumCutoffs *cutoffs)
Definition: vacuum.c:1101
bool vacuum_xid_failsafe_check(const struct VacuumCutoffs *cutoffs)
Definition: vacuum.c:1269
bool VacuumFailsafeActive
Definition: vacuum.c:107
double vac_estimate_reltuples(Relation relation, BlockNumber total_pages, BlockNumber scanned_pages, double scanned_tuples)
Definition: vacuum.c:1331
IndexBulkDeleteResult * vac_bulkdel_one_index(IndexVacuumInfo *ivinfo, IndexBulkDeleteResult *istat, TidStore *dead_items, VacDeadItemsInfo *dead_items_info)
Definition: vacuum.c:2594
#define VACOPT_VERBOSE
Definition: vacuum.h:182
@ VACOPTVALUE_AUTO
Definition: vacuum.h:203
@ VACOPTVALUE_ENABLED
Definition: vacuum.h:205
@ VACOPTVALUE_UNSPECIFIED
Definition: vacuum.h:202
@ VACOPTVALUE_DISABLED
Definition: vacuum.h:204
#define VACOPT_DISABLE_PAGE_SKIPPING
Definition: vacuum.h:188
static void dead_items_cleanup(LVRelState *vacrel)
Definition: vacuumlazy.c:3561
static bool heap_page_is_all_visible(LVRelState *vacrel, Buffer buf, TransactionId *visibility_cutoff_xid, bool *all_frozen)
Definition: vacuumlazy.c:3586
#define VAC_BLK_WAS_EAGER_SCANNED
Definition: vacuumlazy.c:256
static void update_relstats_all_indexes(LVRelState *vacrel)
Definition: vacuumlazy.c:3702
static void dead_items_add(LVRelState *vacrel, BlockNumber blkno, OffsetNumber *offsets, int num_offsets)
Definition: vacuumlazy.c:3519
static void lazy_scan_prune(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, Buffer vmbuffer, bool all_visible_according_to_vm, bool *has_lpdead_items, bool *vm_page_frozen)
Definition: vacuumlazy.c:1926
static BlockNumber heap_vac_scan_next_block(ReadStream *stream, void *callback_private_data, void *per_buffer_data)
Definition: vacuumlazy.c:1546
#define VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL
Definition: vacuumlazy.c:180
static void vacuum_error_callback(void *arg)
Definition: vacuumlazy.c:3737
#define EAGER_SCAN_REGION_SIZE
Definition: vacuumlazy.c:250
static void lazy_truncate_heap(LVRelState *vacrel)
Definition: vacuumlazy.c:3181
static void lazy_vacuum(LVRelState *vacrel)
Definition: vacuumlazy.c:2430
static void lazy_cleanup_all_indexes(LVRelState *vacrel)
Definition: vacuumlazy.c:2984
#define MAX_EAGER_FREEZE_SUCCESS_RATE
Definition: vacuumlazy.c:241
static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, bool *has_lpdead_items)
Definition: vacuumlazy.c:2219
static BlockNumber vacuum_reap_lp_read_stream_next(ReadStream *stream, void *callback_private_data, void *per_buffer_data)
Definition: vacuumlazy.c:2660
#define REL_TRUNCATE_MINIMUM
Definition: vacuumlazy.c:169
static bool should_attempt_truncation(LVRelState *vacrel)
Definition: vacuumlazy.c:3161
static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, bool sharelock, Buffer vmbuffer)
Definition: vacuumlazy.c:1783
VacErrPhase
Definition: vacuumlazy.c:225
@ VACUUM_ERRCB_PHASE_SCAN_HEAP
Definition: vacuumlazy.c:227
@ VACUUM_ERRCB_PHASE_VACUUM_INDEX
Definition: vacuumlazy.c:228
@ VACUUM_ERRCB_PHASE_TRUNCATE
Definition: vacuumlazy.c:231
@ VACUUM_ERRCB_PHASE_INDEX_CLEANUP
Definition: vacuumlazy.c:230
@ VACUUM_ERRCB_PHASE_VACUUM_HEAP
Definition: vacuumlazy.c:229
@ VACUUM_ERRCB_PHASE_UNKNOWN
Definition: vacuumlazy.c:226
static void lazy_scan_heap(LVRelState *vacrel)
Definition: vacuumlazy.c:1188
#define ParallelVacuumIsActive(vacrel)
Definition: vacuumlazy.c:221
static void restore_vacuum_error_info(LVRelState *vacrel, const LVSavedErrInfo *saved_vacrel)
Definition: vacuumlazy.c:3820
void heap_vacuum_rel(Relation rel, VacuumParams *params, BufferAccessStrategy bstrategy)
Definition: vacuumlazy.c:615
static IndexBulkDeleteResult * lazy_vacuum_one_index(Relation indrel, IndexBulkDeleteResult *istat, double reltuples, LVRelState *vacrel)
Definition: vacuumlazy.c:3052
static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
Definition: vacuumlazy.c:1651
static void dead_items_reset(LVRelState *vacrel)
Definition: vacuumlazy.c:3541
#define REL_TRUNCATE_FRACTION
Definition: vacuumlazy.c:170
static bool lazy_check_wraparound_failsafe(LVRelState *vacrel)
Definition: vacuumlazy.c:2931
struct LVSavedErrInfo LVSavedErrInfo
static IndexBulkDeleteResult * lazy_cleanup_one_index(Relation indrel, IndexBulkDeleteResult *istat, double reltuples, bool estimated_count, LVRelState *vacrel)
Definition: vacuumlazy.c:3101
#define PREFETCH_SIZE
Definition: vacuumlazy.c:215
static void lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer, OffsetNumber *deadoffsets, int num_offsets, Buffer vmbuffer)
Definition: vacuumlazy.c:2808
struct LVRelState LVRelState
static void heap_vacuum_eager_scan_setup(LVRelState *vacrel, VacuumParams *params)
Definition: vacuumlazy.c:488
#define BYPASS_THRESHOLD_PAGES
Definition: vacuumlazy.c:187
static void dead_items_alloc(LVRelState *vacrel, int nworkers)
Definition: vacuumlazy.c:3454
#define VACUUM_TRUNCATE_LOCK_TIMEOUT
Definition: vacuumlazy.c:181
static bool lazy_vacuum_all_indexes(LVRelState *vacrel)
Definition: vacuumlazy.c:2555
static void update_vacuum_error_info(LVRelState *vacrel, LVSavedErrInfo *saved_vacrel, int phase, BlockNumber blkno, OffsetNumber offnum)
Definition: vacuumlazy.c:3801
static BlockNumber count_nondeletable_pages(LVRelState *vacrel, bool *lock_waiter_detected)
Definition: vacuumlazy.c:3312
#define VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM
Definition: vacuumlazy.c:257
#define SKIP_PAGES_THRESHOLD
Definition: vacuumlazy.c:209
#define FAILSAFE_EVERY_PAGES
Definition: vacuumlazy.c:193
#define VACUUM_TRUNCATE_LOCK_CHECK_INTERVAL
Definition: vacuumlazy.c:179
static int cmpOffsetNumbers(const void *a, const void *b)
Definition: vacuumlazy.c:1903
static void lazy_vacuum_heap_rel(LVRelState *vacrel)
Definition: vacuumlazy.c:2698
#define VACUUM_FSM_EVERY_PAGES
Definition: vacuumlazy.c:202
TidStore * parallel_vacuum_get_dead_items(ParallelVacuumState *pvs, VacDeadItemsInfo **dead_items_info_p)
ParallelVacuumState * parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes, int nrequested_workers, int vac_work_mem, int elevel, BufferAccessStrategy bstrategy)
void parallel_vacuum_bulkdel_all_indexes(ParallelVacuumState *pvs, long num_table_tuples, int num_index_scans)
void parallel_vacuum_reset_dead_items(ParallelVacuumState *pvs)
void parallel_vacuum_cleanup_all_indexes(ParallelVacuumState *pvs, long num_table_tuples, int num_index_scans, bool estimated_count)
void parallel_vacuum_end(ParallelVacuumState *pvs, IndexBulkDeleteResult **istats)
bool visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer vmbuf, uint8 flags)
void visibilitymap_pin(Relation rel, BlockNumber heapBlk, Buffer *vmbuf)
uint8 visibilitymap_get_status(Relation rel, BlockNumber heapBlk, Buffer *vmbuf)
void visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber *all_frozen)
uint8 visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid, uint8 flags)
#define VM_ALL_FROZEN(r, b, v)
Definition: visibilitymap.h:26
#define VISIBILITYMAP_VALID_BITS
#define VISIBILITYMAP_ALL_FROZEN
#define VISIBILITYMAP_ALL_VISIBLE
bool IsInParallelMode(void)
Definition: xact.c:1088
#define InvalidXLogRecPtr
Definition: xlogdefs.h:28
XLogRecPtr log_newpage_buffer(Buffer buffer, bool page_std)
Definition: xloginsert.c:1237