PostgreSQL Source Code git master
applyparallelworker.c
1/*-------------------------------------------------------------------------
2 * applyparallelworker.c
3 * Support routines for applying xact by parallel apply worker
4 *
5 * Copyright (c) 2023-2026, PostgreSQL Global Development Group
6 *
7 * IDENTIFICATION
8 * src/backend/replication/logical/applyparallelworker.c
9 *
10 * This file contains the code to launch, set up, and tear down a parallel apply
11 * worker which receives the changes from the leader worker and invokes routines
12 * to apply those on the subscriber database. Additionally, this file contains
13 * routines that are intended to support setting up, using, and tearing down a
14 * ParallelApplyWorkerInfo which is required so the leader worker and parallel
15 * apply workers can communicate with each other.
16 *
17 * The parallel apply workers are assigned (if available) as soon as xact's
18 * first stream is received for subscriptions that have set their 'streaming'
19 * option as parallel. The leader apply worker will send changes to this new
20 * worker via shared memory. We keep this worker assigned till the transaction
21 * commit is received and also wait for the worker to finish at commit. This
22 * preserves commit ordering and avoids file I/O in most cases, although we
23 * still need to spill to a file if there is no worker available. See comments
24 * atop logical/worker.c to know more about streamed xacts whose changes are
25 * spilled to disk. It is important to maintain commit order to avoid failures
26 * due to: (a) transaction dependencies - say if we insert a row in the first
27 * transaction and update it in the second transaction on publisher then
28 * allowing the subscriber to apply both in parallel can lead to failure in the
29 * update; (b) deadlocks - allowing transactions that update the same set of
30 * rows/tables in the opposite order to be applied in parallel can lead to
31 * deadlocks.
32 *
33 * A worker pool is used to avoid restarting workers for each streaming
34 * transaction. We maintain each worker's information (ParallelApplyWorkerInfo)
35 * in the ParallelApplyWorkerPool. After successfully launching a new worker,
36 * its information is added to the ParallelApplyWorkerPool. Once the worker
37 * finishes applying the transaction, it is marked as available for re-use.
38 * Now, before starting a new worker to apply the streaming transaction, we
39 * check the list for any available worker. Note that we retain a maximum of
40 * half the max_parallel_apply_workers_per_subscription workers in the pool and
41 * after that, we simply exit the worker after applying the transaction.
42 *
43 * XXX This worker pool threshold is arbitrary and we can provide a GUC
44 * variable for this in the future if required.
45 *
46 * The leader apply worker will create a separate dynamic shared memory segment
47 * when each parallel apply worker starts. The reason for this design is that
48 * we cannot predict how many workers will be needed. It may be possible to
49 * allocate enough shared memory in one segment based on the maximum number of
50 * parallel apply workers (max_parallel_apply_workers_per_subscription), but
51 * this would waste memory if no process is actually started.
52 *
53 * The dynamic shared memory segment contains: (a) a shm_mq that is used to
54 * send changes in the transaction from leader apply worker to parallel apply
55 * worker; (b) another shm_mq that is used to send errors (and other messages
56 * reported via elog/ereport) from the parallel apply worker to leader apply
57 * worker; (c) necessary information to be shared among parallel apply workers
58 * and the leader apply worker (i.e. members of ParallelApplyWorkerShared).
59 *
60 * Locking Considerations
61 * ----------------------
62 * We have a risk of deadlock due to concurrently applying the transactions in
63 * parallel mode that were independent on the publisher side but became
64 * dependent on the subscriber side due to the different database structures
65 * (like schema of subscription tables, constraints, etc.) on each side. This
66 * can happen even without parallel mode when there are concurrent operations
67 * on the subscriber. In order to detect the deadlocks among leader (LA) and
68 * parallel apply (PA) workers, we use lmgr locks when the PA waits for the
69 * next stream (set of changes) and LA waits for PA to finish the transaction.
70 * An alternative approach could be to not allow parallelism when the schema of
71 * tables is different between the publisher and subscriber but that would be
72 * too restrictive and would require the publisher to send much more
73 * information than it is currently sending.
74 *
75 * Consider a case where the subscribed table does not have a unique key on the
76 * publisher and has a unique key on the subscriber. The deadlock can happen in
77 * the following ways:
78 *
79 * 1) Deadlock between the leader apply worker and a parallel apply worker
80 *
81 * Consider that the parallel apply worker (PA) is executing TX-1 and the
82 * leader apply worker (LA) is executing TX-2 concurrently on the subscriber.
83 * Now, LA is waiting for PA because of the unique key constraint of the
84 * subscribed table while PA is waiting for LA to send the next stream of
85 * changes or transaction finish command message.
86 *
87 * In order for lmgr to detect this, we have LA acquire a session lock on the
88 * remote transaction (by pa_lock_stream()) and have PA wait on the lock before
89 * trying to receive the next stream of changes. Specifically, LA will acquire
90 * the lock in AccessExclusive mode before sending the STREAM_STOP and will
91 * release it if already acquired after sending the STREAM_START, STREAM_ABORT
92 * (for toplevel transaction), STREAM_PREPARE, and STREAM_COMMIT. The PA will
93 * acquire the lock in AccessShare mode after processing STREAM_STOP and
94 * STREAM_ABORT (for subtransaction) and then release the lock immediately
95 * after acquiring it.
96 *
97 * The lock graph for the above example will look as follows:
98 * LA (waiting to acquire the lock on the unique index) -> PA (waiting to
99 * acquire the stream lock) -> LA
100 *
101 * This way, when PA is waiting for LA for the next stream of changes, we can
102 * have a wait-edge from PA to LA in lmgr, which will make us detect the
103 * deadlock between LA and PA.
104 *
105 * 2) Deadlock between the leader apply worker and parallel apply workers
106 *
107 * This scenario is similar to the first case but TX-1 and TX-2 are executed by
108 * two parallel apply workers (PA-1 and PA-2 respectively). In this scenario,
109 * PA-2 is waiting for PA-1 to complete its transaction while PA-1 is waiting
110 * for subsequent input from LA. Also, LA is waiting for PA-2 to complete its
111 * transaction in order to preserve the commit order. There is a deadlock among
112 * the three processes.
113 *
114 * In order for lmgr to detect this, we have PA acquire a session lock (this is
115 * a different lock than the one referred to in the previous case; see
116 * pa_lock_transaction()) on the transaction being applied and have LA wait on
117 * the lock before proceeding in the transaction finish commands. Specifically,
118 * PA will acquire this lock in AccessExclusive mode before executing the first
119 * message of the transaction and release it at the xact end. LA will acquire
120 * this lock in AccessShare mode at transaction finish commands and release it
121 * immediately.
122 *
123 * The lock graph for the above example will look as follows:
124 * LA (waiting to acquire the transaction lock) -> PA-2 (waiting to acquire the
125 * lock due to unique index constraint) -> PA-1 (waiting to acquire the stream
126 * lock) -> LA
127 *
128 * This way, when LA is waiting to finish the transaction end command to preserve
129 * the commit order, we will be able to detect the deadlock, if any.
130 *
131 * One might think we can use XactLockTableWait(), but XactLockTableWait()
132 * considers PREPARED TRANSACTION as still in progress which means the lock
133 * won't be released even after the parallel apply worker has prepared the
134 * transaction.
135 *
136 * 3) Deadlock when the shm_mq buffer is full
137 *
138 * In the previous scenario (i.e., PA-1 and PA-2 are executing transactions
139 * concurrently), if the shm_mq buffer between LA and PA-2 is full, LA has to
140 * wait to send messages, and this wait doesn't appear in lmgr.
141 *
142 * To avoid this wait, we use a non-blocking write and wait with a timeout. If
143 * the timeout is exceeded, the LA will serialize all the pending messages to
144 * a file and indicate to PA-2 that it needs to read that file for the remaining
145 * messages. Then LA will start waiting for commit as in the previous case,
146 * which will detect a deadlock if any. See pa_send_data() and
147 * enum TransApplyAction.
148 *
149 * Lock types
150 * ----------
151 * Both the stream lock and the transaction lock mentioned above are
152 * session-level locks because both locks could be acquired outside the
153 * transaction, and the stream lock in the leader needs to persist across
154 * transaction boundaries, i.e., until the end of the streaming transaction.
155 *-------------------------------------------------------------------------
156 */
157
158#include "postgres.h"
159
160#include "libpq/pqformat.h"
161#include "libpq/pqmq.h"
162#include "pgstat.h"
163#include "postmaster/interrupt.h"
166#include "replication/origin.h"
168#include "storage/ipc.h"
169#include "storage/latch.h"
170#include "storage/lmgr.h"
171#include "storage/proc.h"
172#include "tcop/tcopprot.h"
173#include "utils/inval.h"
174#include "utils/memutils.h"
175#include "utils/syscache.h"
176
177#define PG_LOGICAL_APPLY_SHM_MAGIC 0x787ca067
178
179/*
180 * DSM keys for parallel apply worker. Unlike other parallel execution code,
181 * since we don't need to worry about DSM keys conflicting with plan_node_id we
182 * can use small integers.
183 */
184#define PARALLEL_APPLY_KEY_SHARED 1
185#define PARALLEL_APPLY_KEY_MQ 2
186#define PARALLEL_APPLY_KEY_ERROR_QUEUE 3
187
188/* Queue size of DSM, 16 MB for now. */
189#define DSM_QUEUE_SIZE (16 * 1024 * 1024)
190
191/*
192 * Error queue size of DSM. It is desirable to make it large enough that a
193 * typical ErrorResponse can be sent without blocking. That way, a worker that
194 * errors out can write the whole message into the queue and terminate without
195 * waiting for the user backend.
196 */
197#define DSM_ERROR_QUEUE_SIZE (16 * 1024)
198
199/*
200 * There are three fields in each message received by the parallel apply
201 * worker: start_lsn, end_lsn and send_time. Because we have updated these
202 * statistics in the leader apply worker, we can ignore these fields in the
203 * parallel apply worker (see function LogicalRepApplyLoop).
204 */
205#define SIZE_STATS_MESSAGE (2 * sizeof(XLogRecPtr) + sizeof(TimestampTz))
206
207/*
208 * The type of session-level lock on a transaction being applied on a logical
209 * replication subscriber.
210 */
211#define PARALLEL_APPLY_LOCK_STREAM 0
212#define PARALLEL_APPLY_LOCK_XACT 1
213
214/*
215 * Hash table entry to map xid to the parallel apply worker state.
216 */
222
223/*
224 * A hash table used to cache the state of streaming transactions being applied
225 * by the parallel apply workers.
226 */
228
229/*
230 * A list (pool) of active parallel apply workers. The information for
231 * the new worker is added to the list after successfully launching it. The
232 * list entry is removed if there are already enough workers in the worker
233 * pool at the end of the transaction. For more information about the worker
234 * pool, see comments atop this file.
235 */
237
238/*
239 * Information shared between leader apply worker and parallel apply worker.
240 */
242
243/*
244 * Is there a message sent by a parallel apply worker that the leader apply
245 * worker needs to receive?
246 */
248
249/*
250 * Cache the parallel apply worker information required for applying the
251 * current streaming transaction. It is used to save the cost of searching the
252 * hash table when applying the changes between STREAM_START and STREAM_STOP.
253 */
255
256/* A list to maintain subtransactions, if any. */
258
262
263/*
264 * Returns true if it is OK to start a parallel apply worker, false otherwise.
265 */
266static bool
267pa_can_start(void)
268{
269 /* Only leader apply workers can start parallel apply workers. */
270 if (!am_leader_apply_worker())
271 return false;
272
273 /*
274 * It is good to check for any change in the subscription parameter to
275 * avoid the case where for a very long time the change doesn't get
276 * reflected. This can happen when there is a constant flow of streaming
277 * transactions that are handled by parallel apply workers.
278 *
279 * It is better to do it before the below checks so that the latest values
280 * of subscription can be used for the checks.
281 */
283
284 /*
285 * Don't start a new parallel apply worker if the subscription is not
286 * using parallel streaming mode, or if the publisher does not support
287 * parallel apply.
288 */
290 return false;
291
292 /*
293 * Don't start a new parallel worker if the user has set skiplsn as it's
294 * possible that they want to skip the streaming transaction. For
295 * streaming transactions, we need to serialize the transaction to a file
296 * so that we can get the last LSN of the transaction to judge whether to
297 * skip before starting to apply the change.
298 *
299 * One might think that we could allow parallelism if the first lsn of the
300 * transaction is greater than skiplsn, but we don't send it with the
301 * STREAM START message, and it doesn't seem worth sending the extra eight
302 * bytes with the STREAM START to enable parallelism for this case.
303 */
305 return false;
306
307 /*
308 * For streaming transactions that are being applied using a parallel
309 * apply worker, we cannot decide whether to apply the change for a
310 * relation that is not in the READY state (see
311 * should_apply_changes_for_rel) as we won't know remote_final_lsn by that
312 * time. So, we don't start the new parallel apply worker in this case.
313 */
314 if (!AllTablesyncsReady())
315 return false;
316
317 return true;
318}
319
320/*
321 * Set up a dynamic shared memory segment.
322 *
323 * We set up a control region that contains a fixed-size worker info
324 * (ParallelApplyWorkerShared), a message queue, and an error queue.
325 *
326 * Returns true on success, false on failure.
327 */
328static bool
329pa_setup_dsm(ParallelApplyWorkerInfo *winfo)
330{
333 dsm_segment *seg;
334 shm_toc *toc;
336 shm_mq *mq;
337 Size queue_size = DSM_QUEUE_SIZE;
339
340 /*
341 * Estimate how much shared memory we need.
342 *
343 * Because the TOC machinery may choose to insert padding of oddly-sized
344 * requests, we must estimate each chunk separately.
345 *
346 * We need one key to register the location of the header, and two other
347 * keys to track the locations of the message queue and the error message
348 * queue.
349 */
352 shm_toc_estimate_chunk(&e, queue_size);
354
357
358 /* Create the shared memory segment and establish a table of contents. */
359 seg = dsm_create(shm_toc_estimate(&e), 0);
360 if (!seg)
361 return false;
362
364 segsize);
365
366 /* Set up the header region. */
367 shared = shm_toc_allocate(toc, sizeof(ParallelApplyWorkerShared));
368 SpinLockInit(&shared->mutex);
369
373 shared->fileset_state = FS_EMPTY;
374
376
377 /* Set up message queue for the worker. */
378 mq = shm_mq_create(shm_toc_allocate(toc, queue_size), queue_size);
381
382 /* Attach the queue. */
383 winfo->mq_handle = shm_mq_attach(mq, seg, NULL);
384
385 /* Set up error queue for the worker. */
390
391 /* Attach the queue. */
392 winfo->error_mq_handle = shm_mq_attach(mq, seg, NULL);
393
394 /* Return results to caller. */
395 winfo->dsm_seg = seg;
396 winfo->shared = shared;
397
398 return true;
399}
400
401/*
402 * Try to get a parallel apply worker from the pool. If none is available then
403 * start a new one.
404 */
405static ParallelApplyWorkerInfo *
406pa_launch_parallel_worker(void)
407{
408 MemoryContext oldcontext;
409 bool launched;
411 ListCell *lc;
412
413 /* Try to get an available parallel apply worker from the worker pool. */
415 {
416 winfo = (ParallelApplyWorkerInfo *) lfirst(lc);
417
418 if (!winfo->in_use)
419 return winfo;
420 }
421
422 /*
423 * Start a new parallel apply worker.
424 *
425 * The worker info can be used for the lifetime of the worker process, so
426 * create it in a permanent context.
427 */
429
431
432 /* Setup shared memory. */
433 if (!pa_setup_dsm(winfo))
434 {
435 MemoryContextSwitchTo(oldcontext);
436 pfree(winfo);
437 return NULL;
438 }
439
447 false);
448
449 if (launched)
450 {
452 }
453 else
454 {
455 pa_free_worker_info(winfo);
456 winfo = NULL;
457 }
458
459 MemoryContextSwitchTo(oldcontext);
460
461 return winfo;
462}
463
464/*
465 * Allocate a parallel apply worker that will be used for the specified xid.
466 *
467 * We first try to get an available worker from the pool, if any and then try
468 * to launch a new worker. On successful allocation, remember the worker
469 * information in the hash table so that we can get it later for processing the
470 * streaming changes.
471 */
472void
473pa_allocate_worker(TransactionId xid)
474{
475 bool found;
478
479 if (!pa_can_start())
480 return;
481
483 if (!winfo)
484 return;
485
486 /* First time through, initialize parallel apply worker state hashtable. */
488 {
489 HASHCTL ctl;
490
491 MemSet(&ctl, 0, sizeof(ctl));
492 ctl.keysize = sizeof(TransactionId);
493 ctl.entrysize = sizeof(ParallelApplyWorkerEntry);
494 ctl.hcxt = ApplyContext;
495
496 ParallelApplyTxnHash = hash_create("logical replication parallel apply workers hash",
497 16, &ctl,
499 }
500
501 /* Create an entry for the requested transaction. */
502 entry = hash_search(ParallelApplyTxnHash, &xid, HASH_ENTER, &found);
503 if (found)
504 elog(ERROR, "hash table corrupted");
505
506 /* Update the transaction information in shared memory. */
507 SpinLockAcquire(&winfo->shared->mutex);
509 winfo->shared->xid = xid;
510 SpinLockRelease(&winfo->shared->mutex);
511
512 winfo->in_use = true;
513 winfo->serialize_changes = false;
514 entry->winfo = winfo;
515}
516
517/*
518 * Find the assigned worker for the given transaction, if any.
519 */
520ParallelApplyWorkerInfo *
521pa_find_worker(TransactionId xid)
522{
523 bool found;
525
526 if (!TransactionIdIsValid(xid))
527 return NULL;
528
530 return NULL;
531
532 /* Return the cached parallel apply worker if valid. */
534 return stream_apply_worker;
535
536 /* Find an entry for the requested transaction. */
537 entry = hash_search(ParallelApplyTxnHash, &xid, HASH_FIND, &found);
538 if (found)
539 {
540 /* The worker must not have exited. */
541 Assert(entry->winfo->in_use);
542 return entry->winfo;
543 }
544
545 return NULL;
546}
547
548/*
549 * Makes the worker available for reuse.
550 *
551 * This removes the parallel apply worker entry from the hash table so that it
552 * can't be used. If there are enough workers in the pool, it stops the worker
553 * and frees the corresponding info. Otherwise it just marks the worker as
554 * available for reuse.
555 *
556 * For more information about the worker pool, see comments atop this file.
557 */
558static void
559pa_free_worker(ParallelApplyWorkerInfo *winfo)
560{
562 Assert(winfo->in_use);
564
566 elog(ERROR, "hash table corrupted");
567
568 /*
569 * Stop the worker if there are enough workers in the pool.
570 *
571 * XXX Additionally, we also stop the worker if the leader apply worker
572 * serialized part of the transaction data due to a send timeout. This is
573 * because the message could be partially written to the queue and there
574 * is no way to clean the queue other than resending the message until it
575 * succeeds. Instead of trying to send the data which anyway would have
576 * been serialized and then letting the parallel apply worker deal with
577 * the spurious message, we stop the worker.
578 */
579 if (winfo->serialize_changes ||
582 {
584 pa_free_worker_info(winfo);
585
586 return;
587 }
588
589 winfo->in_use = false;
590 winfo->serialize_changes = false;
591}
592
593/*
594 * Free the parallel apply worker information and unlink the files with
595 * serialized changes if any.
596 */
597static void
598pa_free_worker_info(ParallelApplyWorkerInfo *winfo)
599{
600 Assert(winfo);
601
602 if (winfo->mq_handle)
603 shm_mq_detach(winfo->mq_handle);
604
605 if (winfo->error_mq_handle)
607
608 /* Unlink the files with serialized changes. */
609 if (winfo->serialize_changes)
611
612 if (winfo->dsm_seg)
613 dsm_detach(winfo->dsm_seg);
614
615 /* Remove from the worker pool. */
617
618 pfree(winfo);
619}
620
621/*
622 * Detach the error queue for all parallel apply workers.
623 */
624void
625pa_detach_all_error_mq(void)
626{
627 ListCell *lc;
628
630 {
632
633 if (winfo->error_mq_handle)
634 {
636 winfo->error_mq_handle = NULL;
637 }
638 }
639}
640
641/*
642 * Check if there are any pending spooled messages.
643 */
644static bool
645pa_has_spooled_messages(void)
646{
647 PartialFileSetState fileset_state;
648
649 fileset_state = pa_get_fileset_state();
650
651 return (fileset_state != FS_EMPTY);
652}
653
654/*
655 * Replay the spooled messages once the leader apply worker has finished
656 * serializing changes to the file.
657 *
658 * Returns false if there aren't any pending spooled messages, true otherwise.
659 */
660static bool
661pa_spooled_messages(void)
662{
663 PartialFileSetState fileset_state;
664
665 fileset_state = pa_get_fileset_state();
666
667 if (fileset_state == FS_EMPTY)
668 return false;
669
670 /*
671 * If the leader apply worker is busy serializing the partial changes then
672 * acquire the stream lock now and wait for the leader worker to finish
673 * serializing the changes. Otherwise, the parallel apply worker won't get
674 * a chance to receive a STREAM_STOP (and acquire the stream lock) until
675 * the leader has serialized all changes, which can lead to undetected
676 * deadlock.
677 *
678 * Note that the fileset state can be FS_SERIALIZE_DONE once the leader
679 * worker has finished serializing the changes.
680 */
681 if (fileset_state == FS_SERIALIZE_IN_PROGRESS)
682 {
685
686 fileset_state = pa_get_fileset_state();
687 }
688
689 /*
690 * We cannot read the file immediately after the leader has serialized all
691 * changes to the file because there may still be messages in the memory
692 * queue. We will apply all spooled messages the next time we call this
693 * function and that will ensure there are no messages left in the memory
694 * queue.
695 */
696 if (fileset_state == FS_SERIALIZE_DONE)
697 {
699 }
700 else if (fileset_state == FS_READY)
701 {
706 }
707
708 return true;
709}
710
711/*
712 * Interrupt handler for main loop of parallel apply worker.
713 */
714static void
715ProcessParallelApplyInterrupts(void)
716{
718
720 {
721 ereport(LOG,
722 (errmsg("logical replication parallel apply worker for subscription \"%s\" has finished",
724
725 proc_exit(0);
726 }
727
729 {
730 ConfigReloadPending = false;
732 }
733}
734
735/* Parallel apply worker main loop. */
736static void
737LogicalParallelApplyLoop(shm_mq_handle *mqh)
738{
740 ErrorContextCallback errcallback;
742
743 /*
744 * Init the ApplyMessageContext which we clean up after each replication
745 * protocol message.
746 */
748 "ApplyMessageContext",
750
751 /*
752 * Push apply error context callback. Fields will be filled while applying
753 * a change.
754 */
755 errcallback.callback = apply_error_callback;
756 errcallback.previous = error_context_stack;
757 error_context_stack = &errcallback;
758
759 for (;;)
760 {
761 void *data;
762 Size len;
763
765
766 /* Ensure we are reading the data into our memory context. */
768
769 shmq_res = shm_mq_receive(mqh, &len, &data, true);
770
772 {
774 int c;
775
776 if (len == 0)
777 elog(ERROR, "invalid message length");
778
780
781 /*
782 * The first byte of messages sent from leader apply worker to
783 * parallel apply workers can only be PqReplMsg_WALData.
784 */
785 c = pq_getmsgbyte(&s);
786 if (c != PqReplMsg_WALData)
787 elog(ERROR, "unexpected message \"%c\"", c);
788
789 /*
790 * Ignore statistics fields that have been updated by the leader
791 * apply worker.
792 *
793 * XXX We can avoid sending the statistics fields from the leader
794 * apply worker but for that, it needs to rebuild the entire
795 * message by removing these fields which could be more work than
796 * simply ignoring these fields in the parallel apply worker.
797 */
799
800 apply_dispatch(&s);
801 }
802 else if (shmq_res == SHM_MQ_WOULD_BLOCK)
803 {
804 /* Replay the changes from the file, if any. */
806 {
807 int rc;
808
809 /* Wait for more work. */
810 rc = WaitLatch(MyLatch,
812 1000L,
814
815 if (rc & WL_LATCH_SET)
817 }
818 }
819 else
820 {
822
825 errmsg("lost connection to the logical replication apply worker")));
826 }
827
830 }
831
832 /* Pop the error context stack. */
833 error_context_stack = errcallback.previous;
834
836}
837
838/*
839 * Make sure the leader apply worker tries to read from our error queue one more
840 * time. This guards against the case where we exit uncleanly without sending
841 * an ErrorResponse, for example because some code calls proc_exit directly.
842 *
843 * Also explicitly detach from dsm segment to invoke on_dsm_detach callbacks,
844 * if any. See ParallelWorkerShutdown for details.
845 */
846static void
855
856/*
857 * Parallel apply worker entry point.
858 */
859void
860ParallelApplyWorkerMain(Datum main_arg)
861{
863 dsm_handle handle;
864 dsm_segment *seg;
865 shm_toc *toc;
866 shm_mq *mq;
868 shm_mq_handle *error_mqh;
872
874
875 /*
876 * Setup signal handling.
877 *
878 * Note: We intentionally used SIGUSR2 to trigger a graceful shutdown
879 * initiated by the leader apply worker. This helps to differentiate it
880 * from the case where we abort the current transaction and exit on
881 * receiving SIGTERM.
882 */
886
887 /*
888 * Attach to the dynamic shared memory segment for the parallel apply, and
889 * find its table of contents.
890 *
891 * Like parallel query, we don't need resource owner by this time. See
892 * ParallelWorkerMain.
893 */
894 memcpy(&handle, MyBgworkerEntry->bgw_extra, sizeof(dsm_handle));
895 seg = dsm_attach(handle);
896 if (!seg)
899 errmsg("could not map dynamic shared memory segment")));
900
902 if (!toc)
905 errmsg("invalid magic number in dynamic shared memory segment")));
906
907 /* Look up the shared information. */
908 shared = shm_toc_lookup(toc, PARALLEL_APPLY_KEY_SHARED, false);
909 MyParallelShared = shared;
910
911 /*
912 * Attach to the message queue.
913 */
916 mqh = shm_mq_attach(mq, seg, NULL);
917
918 /*
919 * Primary initialization is complete. Now, we can attach to our slot.
920 * This is to ensure that the leader apply worker does not write data to
921 * the uninitialized memory queue.
922 */
924
925 /*
926 * Register the shutdown callback after we are attached to the worker
927 * slot. This is to ensure that MyLogicalRepWorker remains valid when this
928 * callback is invoked.
929 */
931
936
937 /*
938 * Attach to the error queue.
939 */
942 error_mqh = shm_mq_attach(mq, seg, NULL);
943
944 pq_redirect_to_shm_mq(seg, error_mqh);
947
950
952
954
955 /* Setup replication origin tracking. */
958 originname, sizeof(originname));
960
961 /*
962 * The parallel apply worker doesn't need to monopolize this replication
963 * origin which was already acquired by its leader process.
964 */
968
969 /*
970 * Setup callback for syscache so that we know when something changes in
971 * the subscription relation state.
972 */
975 (Datum) 0);
976
978
980
981 /*
982 * The parallel apply worker must not get here because the parallel apply
983 * worker will only stop when it receives a SIGTERM or SIGUSR2 from the
984 * leader, or SIGINT from itself, or when there is an error. None of these
985 * cases will allow the code to reach here.
986 */
987 Assert(false);
988}
989
990/*
991 * Handle receipt of an interrupt indicating a parallel apply worker message.
992 *
993 * Note: this is called within a signal handler! All we can do is set a flag
994 * that will cause the next CHECK_FOR_INTERRUPTS() to invoke
995 * ProcessParallelApplyMessages().
996 */
997void
1004
1005/*
1006 * Process a single protocol message received from a single parallel apply
1007 * worker.
1008 */
1009static void
1010ProcessParallelApplyMessage(StringInfo msg)
1011{
1012 char msgtype;
1013
1014 msgtype = pq_getmsgbyte(msg);
1015
1016 switch (msgtype)
1017 {
1019 {
1021
1022 /* Parse ErrorResponse. */
1024
1025 /*
1026 * If desired, add a context line to show that this is a
1027 * message propagated from a parallel apply worker. Otherwise,
1028 * it can sometimes be confusing to understand what actually
1029 * happened.
1030 */
1031 if (edata.context)
1032 edata.context = psprintf("%s\n%s", edata.context,
1033 _("logical replication parallel apply worker"));
1034 else
1035 edata.context = pstrdup(_("logical replication parallel apply worker"));
1036
1037 /*
1038 * Context beyond that should use the error context callbacks
1039 * that were in effect in LogicalRepApplyLoop().
1040 */
1042
1043 /*
1044 * The actual error must have been reported by the parallel
1045 * apply worker.
1046 */
1047 ereport(ERROR,
1049 errmsg("logical replication parallel apply worker exited due to error"),
1050 errcontext("%s", edata.context)));
1051 }
1052
1053 /*
1054 * Don't need to do anything about NoticeResponse and
1055 * NotificationResponse as the logical replication worker doesn't
1056 * need to send messages to the client.
1057 */
1060 break;
1061
1062 default:
1063 elog(ERROR, "unrecognized message type received from logical replication parallel apply worker: %c (message length %d bytes)",
1064 msgtype, msg->len);
1065 }
1066}
1067
1068/*
1069 * Handle any queued protocol messages received from parallel apply workers.
1070 */
1071void
1072ProcessParallelApplyMessages(void)
1073{
1074 ListCell *lc;
1075 MemoryContext oldcontext;
1076
1078
1079 /*
1080 * This is invoked from ProcessInterrupts(), and since some of the
1081 * functions it calls contain CHECK_FOR_INTERRUPTS(), there is a potential
1082 * for recursive calls if more signals are received while this runs. It's
1083 * unclear that recursive entry would be safe, and it doesn't seem useful
1084 * even if it is safe, so let's block interrupts until done.
1085 */
1087
1088 /*
1089 * Moreover, CurrentMemoryContext might be pointing almost anywhere. We
1090 * don't want to risk leaking data into long-lived contexts, so let's do
1091 * our work here in a private context that we can reset on each use.
1092 */
1093 if (!hpam_context) /* first time through? */
1095 "ProcessParallelApplyMessages",
1097 else
1099
1100 oldcontext = MemoryContextSwitchTo(hpam_context);
1101
1103
1104 foreach(lc, ParallelApplyWorkerPool)
1105 {
1106 shm_mq_result res;
1107 Size nbytes;
1108 void *data;
1110
1111 /*
1112 * The leader will detach from the error queue and set it to NULL
1113 * before preparing to stop all parallel apply workers, so we don't
1114 * need to handle error messages anymore. See
1115 * logicalrep_worker_detach.
1116 */
1117 if (!winfo->error_mq_handle)
1118 continue;
1119
1120 res = shm_mq_receive(winfo->error_mq_handle, &nbytes, &data, true);
1121
1122 if (res == SHM_MQ_WOULD_BLOCK)
1123 continue;
1124 else if (res == SHM_MQ_SUCCESS)
1125 {
1126 StringInfoData msg;
1127
1128 initStringInfo(&msg);
1129 appendBinaryStringInfo(&msg, data, nbytes);
1131 pfree(msg.data);
1132 }
1133 else
1134 ereport(ERROR,
1136 errmsg("lost connection to the logical replication parallel apply worker")));
1137 }
1138
1139 MemoryContextSwitchTo(oldcontext);
1140
1141 /* Might as well clear the context on our way out */
1143
1145}
1146
1147/*
1148 * Send the data to the specified parallel apply worker via shared-memory
1149 * queue.
1150 *
1151 * Returns false if the attempt to send data via shared memory times out, true
1152 * otherwise.
1153 */
1154bool
1155pa_send_data(ParallelApplyWorkerInfo *winfo, Size nbytes, const void *data)
1156{
1157 int rc;
1158 shm_mq_result result;
1159 TimestampTz startTime = 0;
1160
1162 Assert(!winfo->serialize_changes);
1163
1164 /*
1165 * We don't try to send data to parallel worker for 'immediate' mode. This
1166 * is primarily used for testing purposes.
1167 */
1169 return false;
1170
1171/*
1172 * This timeout is a bit arbitrary but testing revealed that it is sufficient
1173 * to send the message unless the parallel apply worker is waiting on some
1174 * lock or there is a serious resource crunch. See the comments atop this file
1175 * to know why we are using a non-blocking way to send the message.
1176 */
1177#define SHM_SEND_RETRY_INTERVAL_MS 1000
1178#define SHM_SEND_TIMEOUT_MS (10000 - SHM_SEND_RETRY_INTERVAL_MS)
1179
1180 for (;;)
1181 {
1182 result = shm_mq_send(winfo->mq_handle, nbytes, data, true, true);
1183
1184 if (result == SHM_MQ_SUCCESS)
1185 return true;
1186 else if (result == SHM_MQ_DETACHED)
1187 ereport(ERROR,
1188 (errcode(ERRCODE_CONNECTION_FAILURE),
1189 errmsg("could not send data to shared-memory queue")));
1190
1191 Assert(result == SHM_MQ_WOULD_BLOCK);
1192
1193 /* Wait before retrying. */
1194 rc = WaitLatch(MyLatch,
1195 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
1196 SHM_SEND_RETRY_INTERVAL_MS,
1197 WAIT_EVENT_LOGICAL_APPLY_SEND_DATA);
1198
1199 if (rc & WL_LATCH_SET)
1200 {
1201 ResetLatch(MyLatch);
1202 CHECK_FOR_INTERRUPTS();
1203 }
1204
1205 if (startTime == 0)
1206 startTime = GetCurrentTimestamp();
1207 else if (TimestampDifferenceExceeds(startTime, GetCurrentTimestamp(),
1208 SHM_SEND_TIMEOUT_MS))
1209 return false;
1210 }
1211}
1212
1213/*
1214 * Switch to PARTIAL_SERIALIZE mode for the current transaction -- this means
1215 * that the current data and any subsequent data for this transaction will be
1216 * serialized to a file. This is done to prevent possible deadlocks with
1217 * another parallel apply worker (refer to the comments atop this file).
1218 */
1219void
1220pa_switch_to_partial_serialize(ParallelApplyWorkerInfo *winfo,
1221 bool stream_locked)
1222{
1223 ereport(LOG,
1224 (errmsg("logical replication apply worker will serialize the remaining changes of remote transaction %u to a file",
1225 winfo->shared->xid)));
1226
1227 /*
1228 * The parallel apply worker could be stuck for some reason (say waiting
1229 * on some lock held by another backend), so stop trying to send data
1230 * directly to it and start serializing data to the file instead.
1231 */
1232 winfo->serialize_changes = true;
1233
1234 /* Initialize the stream fileset. */
1235 stream_start_internal(winfo->shared->xid, true);
1236
1237 /*
1238 * Acquire the stream lock if not already held, to make sure that the
1239 * parallel apply worker will wait for the leader to release the stream
1240 * lock until the end of the transaction.
1241 */
1242 if (!stream_locked)
1243 pa_lock_stream(winfo->shared->xid, AccessExclusiveLock);
1244
1245 pa_set_stream_apply_worker(winfo);
1246}
1247
1248/*
1249 * Wait until the parallel apply worker's transaction state has reached or
1250 * exceeded the given xact_state.
1251 */
1252static void
1253pa_wait_for_xact_state(ParallelApplyWorkerInfo *winfo,
1254 ParallelTransState xact_state)
1255{
1256 for (;;)
1257 {
1258 /*
1259 * Stop if the transaction state has reached or exceeded the given
1260 * xact_state.
1261 */
1262 if (pa_get_xact_state(winfo->shared) >= xact_state)
1263 break;
1264
1265 /* Wait to be signalled. */
1266 (void) WaitLatch(MyLatch,
1267 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
1268 10L,
1269 WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE);
1270
1271 /* Reset the latch so we don't spin. */
1272 ResetLatch(MyLatch);
1273
1274 /* An interrupt may have occurred while we were waiting. */
1275 CHECK_FOR_INTERRUPTS();
1276 }
1277}
1278
1279/*
1280 * Wait until the parallel apply worker's transaction finishes.
1281 */
1282static void
1283pa_wait_for_xact_finish(ParallelApplyWorkerInfo *winfo)
1284{
1285 /*
1286 * Wait until the parallel apply worker has set the state to
1287 * PARALLEL_TRANS_STARTED, which means it has acquired the transaction
1288 * lock. This is to prevent the leader apply worker from acquiring the
1289 * transaction lock earlier than the parallel apply worker.
1290 */
1291 pa_wait_for_xact_state(winfo, PARALLEL_TRANS_STARTED);
1292
1293 /*
1294 * Wait for the transaction lock to be released. This is required to
1295 * detect deadlock among leader and parallel apply workers. Refer to the
1296 * comments atop this file.
1297 */
1298 pa_lock_transaction(winfo->shared->xid, AccessShareLock);
1299 pa_unlock_transaction(winfo->shared->xid, AccessShareLock);
1300
1301 /*
1302 * Check if the state becomes PARALLEL_TRANS_FINISHED in case the parallel
1303 * apply worker failed while applying changes causing the lock to be
1304 * released.
1305 */
1306 if (pa_get_xact_state(winfo->shared) != PARALLEL_TRANS_FINISHED)
1307 ereport(ERROR,
1308 (errcode(ERRCODE_CONNECTION_FAILURE),
1309 errmsg("lost connection to the logical replication parallel apply worker")));
1310}
1311
1312/*
1313 * Set the transaction state for a given parallel apply worker.
1314 */
1315void
1316pa_set_xact_state(ParallelApplyWorkerShared *wshared,
1317 ParallelTransState xact_state)
1318{
1319 SpinLockAcquire(&wshared->mutex);
1320 wshared->xact_state = xact_state;
1321 SpinLockRelease(&wshared->mutex);
1322}
1323
1324/*
1325 * Get the transaction state for a given parallel apply worker.
1326 */
1327static ParallelTransState
1328pa_get_xact_state(ParallelApplyWorkerShared *wshared)
1329{
1330 ParallelTransState xact_state;
1331
1332 SpinLockAcquire(&wshared->mutex);
1333 xact_state = wshared->xact_state;
1334 SpinLockRelease(&wshared->mutex);
1335
1336 return xact_state;
1337}
1338
1339/*
1340 * Cache the parallel apply worker information.
1341 */
1342void
1343pa_set_stream_apply_worker(ParallelApplyWorkerInfo *winfo)
1344{
1345 stream_apply_worker = winfo;
1346}
1347
1348/*
1349 * Form a unique savepoint name for the streaming transaction.
1350 *
1351 * Note that different subscriptions for publications on different nodes can
1352 * receive the same remote xid, so we need to use the subscription id along with it.
1353 *
1354 * Returns the name in the supplied buffer.
1355 */
1356static void
1357pa_savepoint_name(Oid suboid, TransactionId xid, char *spname, Size szsp)
1358{
1359 snprintf(spname, szsp, "pg_sp_%u_%u", suboid, xid);
1360}
1361
1362/*
1363 * Define a savepoint for a subxact in parallel apply worker if needed.
1364 *
1365 * The parallel apply worker can figure out if a new subtransaction was
1366 * started by checking if the new change arrived with a different xid. In that
1367 * case, define a named savepoint so that we are able to roll back to it
1368 * if required.
1369 */
1370void
1371pa_start_subtrans(TransactionId current_xid, TransactionId top_xid)
1372{
1373 if (current_xid != top_xid &&
1374 !list_member_xid(subxactlist, current_xid))
1375 {
1376 MemoryContext oldctx;
1377 char spname[NAMEDATALEN];
1378
1379 pa_savepoint_name(MySubscription->oid, current_xid,
1380 spname, sizeof(spname));
1381
1382 elog(DEBUG1, "defining savepoint %s in logical replication parallel apply worker", spname);
1383
1384 /* We must be in transaction block to define the SAVEPOINT. */
1385 if (!IsTransactionBlock())
1386 {
1387 if (!IsTransactionState())
1388 StartTransactionCommand();
1389
1390 BeginTransactionBlock();
1391 CommitTransactionCommand();
1392 }
1393
1394 DefineSavepoint(spname);
1395
1396 /*
1397 * CommitTransactionCommand is needed to start a subtransaction after
1398 * issuing a SAVEPOINT inside a transaction block (see
1399 * StartSubTransaction()).
1400 */
1401 CommitTransactionCommand();
1402
1403 oldctx = MemoryContextSwitchTo(TopTransactionContext);
1404 subxactlist = lappend_xid(subxactlist, current_xid);
1405 MemoryContextSwitchTo(oldctx);
1406 }
1407}
1408
1409/* Reset the list that maintains subtransactions. */
1410void
1411pa_reset_subtrans(void)
1412{
1413 /*
1414 * We don't need to free this explicitly as the allocated memory will be
1415 * freed at the transaction end.
1416 */
1417 subxactlist = NIL;
1418}
1419
1420/*
1421 * Handle STREAM ABORT message when the transaction was applied in a parallel
1422 * apply worker.
1423 */
1424void
1425pa_stream_abort(LogicalRepStreamAbortData *abort_data)
1426{
1427 TransactionId xid = abort_data->xid;
1428 TransactionId subxid = abort_data->subxid;
1429
1430 /*
1431 * Update origin state so we can restart streaming from correct position
1432 * in case of crash.
1433 */
1434 replorigin_xact_state.origin_lsn = abort_data->abort_lsn;
1435 replorigin_xact_state.origin_timestamp = abort_data->abort_time;
1436
1437 /*
1438 * If the two XIDs are the same, it's in fact abort of toplevel xact, so
1439 * just free the subxactlist.
1440 */
1441 if (subxid == xid)
1442 {
1443 pa_set_xact_state(MyParallelShared, PARALLEL_TRANS_FINISHED);
1444
1445 /*
1446 * Release the lock as we might be processing an empty streaming
1447 * transaction in which case the lock won't be released during
1448 * transaction rollback.
1449 *
1450 * Note that it's ok to release the transaction lock before aborting
1451 * the transaction because even if the parallel apply worker dies due
1452 * to crash or some other reason, such a transaction would still be
1453 * considered aborted.
1454 */
1455 pa_unlock_transaction(xid, AccessExclusiveLock);
1456
1457 AbortCurrentTransaction();
1458
1459 if (IsTransactionBlock())
1460 {
1461 EndTransactionBlock(false);
1462 CommitTransactionCommand();
1463 }
1464
1465 pa_reset_subtrans();
1466
1467 pgstat_report_activity(STATE_IDLE, NULL);
1468 }
1469 else
1470 {
1471 /* OK, so it's a subxact. Rollback to the savepoint. */
1472 int i;
1473 char spname[NAMEDATALEN];
1474
1475 pa_savepoint_name(MySubscription->oid, subxid, spname, sizeof(spname));
1476
1477 elog(DEBUG1, "rolling back to savepoint %s in logical replication parallel apply worker", spname);
1478
1479 /*
1480 * Search the subxactlist, determine the offset tracked for the
1481 * subxact, and truncate the list.
1482 *
1483 * Note that for an empty sub-transaction we won't find the subxid
1484 * here.
1485 */
1486 for (i = list_length(subxactlist) - 1; i >= 0; i--)
1487 {
1488 TransactionId xid_tmp = lfirst_xid(list_nth_cell(subxactlist, i));
1489
1490 if (xid_tmp == subxid)
1491 {
1492 RollbackToSavepoint(spname);
1493 CommitTransactionCommand();
1494 subxactlist = list_truncate(subxactlist, i + 1);
1495 break;
1496 }
1497 }
1498 }
1499}
1500
1501/*
1502 * Set the fileset state for a particular parallel apply worker. The fileset
1503 * will be set once the leader worker has serialized all changes to the file
1504 * so that it can be used by the parallel apply worker.
1505 */
1506void
1507pa_set_fileset_state(ParallelApplyWorkerShared *wshared,
1508 PartialFileSetState fileset_state)
1509{
1510 SpinLockAcquire(&wshared->mutex);
1511 wshared->fileset_state = fileset_state;
1512
1513 if (fileset_state == FS_SERIALIZE_DONE)
1514 {
1515 Assert(am_leader_apply_worker());
1516 Assert(MyLogicalRepWorker->stream_fileset);
1517 wshared->fileset = *MyLogicalRepWorker->stream_fileset;
1518 }
1519
1520 SpinLockRelease(&wshared->mutex);
1521}
1522
1523/*
1524 * Get the fileset state for the current parallel apply worker.
1525 */
1526static PartialFileSetState
1527pa_get_fileset_state(void)
1528{
1529 PartialFileSetState fileset_state;
1530
1531 Assert(am_parallel_apply_worker());
1532
1533 SpinLockAcquire(&MyParallelShared->mutex);
1534 fileset_state = MyParallelShared->fileset_state;
1535 SpinLockRelease(&MyParallelShared->mutex);
1536
1537 return fileset_state;
1538}
1539
1540/*
1541 * Helper functions to acquire and release a lock for each stream block.
1542 *
1543 * Set locktag_field4 to PARALLEL_APPLY_LOCK_STREAM to indicate that it's a
1544 * stream lock.
1545 *
1546 * Refer to the comments atop this file to see how the stream lock is used.
1547 */
1548void
1549pa_lock_stream(TransactionId xid, LOCKMODE lockmode)
1550{
1551 LockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,
1552 PARALLEL_APPLY_LOCK_STREAM, lockmode);
1553}
1554
1555void
1556pa_unlock_stream(TransactionId xid, LOCKMODE lockmode)
1557{
1558 UnlockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,
1559 PARALLEL_APPLY_LOCK_STREAM, lockmode);
1560}
1561
1562/*
1563 * Helper functions to acquire and release a lock for each local transaction
1564 * apply.
1565 *
1566 * Set locktag_field4 to PARALLEL_APPLY_LOCK_XACT to indicate that it's a
1567 * transaction lock.
1568 *
1569 * Note that all the callers must pass a remote transaction ID instead of a
1570 * local transaction ID as xid. This is because the local transaction ID will
1571 * only be assigned while applying the first change in the parallel apply
1572 * worker, but it's possible that this first change is blocked by a
1573 * concurrently executing transaction in another parallel apply worker. We
1574 * can only communicate the local transaction id to the leader after applying
1575 * the first change so it won't be able to wait after sending the xact finish
1576 * command using this lock.
1577 *
1578 * Refer to the comments atop this file to see how the transaction lock is
1579 * used.
1580 */
1581void
1582pa_lock_transaction(TransactionId xid, LOCKMODE lockmode)
1583{
1584 LockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,
1585 PARALLEL_APPLY_LOCK_XACT, lockmode);
1586}
1587
1588void
1589pa_unlock_transaction(TransactionId xid, LOCKMODE lockmode)
1590{
1591 UnlockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,
1592 PARALLEL_APPLY_LOCK_XACT, lockmode);
1593}
1594
1595/*
1596 * Decrement the number of pending streaming blocks and wait on the stream lock
1597 * if there is no pending block available.
1598 */
1599void
1600pa_decr_and_wait_stream_block(void)
1601{
1602 Assert(am_parallel_apply_worker());
1603
1604 /*
1605 * It is only possible to not have any pending stream chunks when we are
1606 * applying spooled messages.
1607 */
1608 if (pg_atomic_read_u32(&MyParallelShared->pending_stream_count) == 0)
1609 {
1610 if (pa_has_spooled_message_pending())
1611 return;
1612
1613 elog(ERROR, "invalid pending streaming chunk 0");
1614 }
1615
1616 if (pg_atomic_sub_fetch_u32(&MyParallelShared->pending_stream_count, 1) == 0)
1617 {
1618 pa_lock_stream(MyParallelShared->xid, AccessShareLock);
1619 pa_unlock_stream(MyParallelShared->xid, AccessShareLock);
1620 }
1621}
1622
1623/*
1624 * Finish processing the streaming transaction in the leader apply worker.
1625 */
1626void
1627pa_xact_finish(ParallelApplyWorkerInfo *winfo, XLogRecPtr remote_lsn)
1628{
1629 Assert(am_leader_apply_worker());
1630
1631 /*
1632 * Unlock the shared object lock so that parallel apply worker can
1633 * continue to receive and apply changes.
1634 */
1635 pa_unlock_stream(winfo->shared->xid, AccessExclusiveLock);
1636
1637 /*
1638 * Wait for that worker to finish. This is necessary to maintain commit
1639 * order which avoids failures due to transaction dependencies and
1640 * deadlocks.
1641 */
1642 pa_wait_for_xact_finish(winfo);
1643
1644 if (XLogRecPtrIsValid(remote_lsn))
1645 store_flush_position(remote_lsn, winfo->shared->last_commit_end);
1646
1647 pa_free_worker(winfo);
1648}