PostgreSQL Source Code  git master
radixtree.h
1 /*-------------------------------------------------------------------------
2  *
3  * radixtree.h
4  * Template for adaptive radix tree.
5  *
6  * A template to generate an "adaptive radix tree", specialized for value
7  * types and for local/shared memory.
8  *
9  * The concept originates from the paper "The Adaptive Radix Tree: ARTful
10  * Indexing for Main-Memory Databases" by Viktor Leis, Alfons Kemper,
11  * and Thomas Neumann, 2013.
12  *
13  * Radix trees have some advantages over hash tables:
14  * - The keys are logically ordered, allowing efficient sorted iteration
15  * and range queries
16  * - Operations using keys that are lexicographically close together
17  * will have favorable memory locality
18  * - Memory use grows gradually rather than by doubling
19  * - The key does not need to be stored with the value, since the key
20  * is implicitly contained in the path to the value
21  *
22  * Some disadvantages are:
23  * - Point queries (along with insertion and deletion) are slower than
24  * a linear probing hash table as in simplehash.h
25  * - Memory usage varies by key distribution, so it is difficult to predict
26  *
27  * A classic radix tree consists of nodes, each containing an array of
28  * pointers to child nodes. The size of the array is determined by the
29  * "span" of the tree, which is the number of bits of the key used to
30  * index into the array. For example, with a span of 6, a "chunk"
31  * of 6 bits is extracted from the key at each node traversal, and
32  * the arrays thus have a "fanout" of 2^6 or 64 entries. A large span
33  * allows a shorter tree, but requires larger arrays that may be mostly
34  * wasted space.
35  *
36  * The key idea of the adaptive radix tree is to choose different
37  * data structures based on the number of child nodes. A node will
38  * start out small when it is first populated, and when it is full,
39  * it is replaced by the next larger size. Conversely, when a node
40  * becomes mostly empty, it is replaced by the next smaller node. The
41  * bulk of the code complexity in this module stems from this dynamic
42  * switching. One mitigating factor is using a span of 8, since bytes
43  * are directly addressable.
44  *
45  * The ART paper mentions three ways to implement leaves:
46  *
47  * "- Single-value leaves: The values are stored using an addi-
48  * tional leaf node type which stores one value.
49  * - Multi-value leaves: The values are stored in one of four
50  * different leaf node types, which mirror the structure of
51  * inner nodes, but contain values instead of pointers.
52  * - Combined pointer/value slots: If values fit into point-
53  * ers, no separate node types are necessary. Instead, each
54  * pointer storage location in an inner node can either
55  * store a pointer or a value."
56  *
57  * We use a form of "combined pointer/value slots", as recommended. Values
58  * whose size (if fixed at compile time) is equal to or smaller than the
59  * platform's pointer type are stored in the child slots of the last-level
60  * node, while larger values are stored as "single-value" leaves, as above. This
61  * offers flexibility and efficiency. Variable-length types are currently
62  * treated as single-value leaves for simplicity, but future work may
63  * allow those to be stored in the child pointer arrays, when they're
64  * small enough.
65  *
66  * There are two other techniques described in the paper that are not
67  * implemented here:
68  * - path compression "...removes all inner nodes that have only a single child."
69  * - lazy path expansion "...inner nodes are only created if they are required
70  * to distinguish at least two leaf nodes."
71  *
72  * We do have a form of "poor man's path compression", however, enabled by
73  * only supporting unsigned integer keys (for now assumed to be 64-bit):
74  * A tree doesn't contain paths where the highest bytes of all keys are
75  * zero. That way, the tree's height adapts to the distribution of keys.
76  *
77  * To handle concurrency, we use a single reader-writer lock for the
78  * radix tree. If concurrent write operations are possible, the tree
79  * must be exclusively locked during write operations such as RT_SET()
80  * and RT_DELETE(), and share locked during read operations such as
81  * RT_FIND() and RT_BEGIN_ITERATE().
82  *
83  * TODO: The current locking mechanism is not optimized for high
84  * concurrency with mixed read-write workloads. In the future it might
85  * be worthwhile to replace it with the Optimistic Lock Coupling or
86  * ROWEX mentioned in the paper "The ART of Practical Synchronization"
87  * by the same authors as the ART paper, 2016.
88  *
89  * To generate a radix tree and associated functions for a use case
90  * several macros have to be #define'ed before this file is included.
91  * Including the file #undef's all those, so a new radix tree can be
92  * generated afterwards.
93  *
94  * The relevant parameters are:
95  * - RT_PREFIX - prefix for all symbol names generated. A prefix of "foo"
96  * will result in radix tree type "foo_radix_tree" and functions like
97  * "foo_create"/"foo_free" and so forth.
98  * - RT_DECLARE - if defined function prototypes and type declarations are
99  * generated
100  * - RT_DEFINE - if defined function definitions are generated
101  * - RT_SCOPE - in which scope (e.g. extern, static inline) do function
102  * declarations reside
103  * - RT_VALUE_TYPE - the type of the value.
104  * - RT_VARLEN_VALUE_SIZE() - for variable length values, an expression
105  * involving a pointer to the value type, to calculate size.
106  * NOTE: implies that the value is in fact variable-length,
107  * so do not set for fixed-length values.
108  * - RT_RUNTIME_EMBEDDABLE_VALUE - for variable length values, allows
109  * storing the value in a child pointer slot, rather than as a single-
110  * value leaf, if small enough. This requires that the value, when
111  * read as a child pointer, can be tagged in the lowest bit.
112  *
113  * Optional parameters:
114  * - RT_SHMEM - if defined, the radix tree is created in the DSA area
115  * so that multiple processes can access it simultaneously.
116  * - RT_DEBUG - if defined add stats tracking and debugging functions
117  *
118  * Interface
119  * ---------
120  *
121  * RT_CREATE - Create a new, empty radix tree
122  * RT_FREE - Free the radix tree
123  * RT_FIND - Lookup the value for a given key
124  * RT_SET - Set a key-value pair
125  * RT_BEGIN_ITERATE - Begin iterating through all key-value pairs
126  * RT_ITERATE_NEXT - Return next key-value pair, if any
127  * RT_END_ITERATE - End iteration
128  * RT_MEMORY_USAGE - Get the memory usage, as measured by the space used in memory context blocks
129  *
130  * Interface for Shared Memory
131  * ---------
132  *
133  * RT_ATTACH - Attach to the radix tree
134  * RT_DETACH - Detach from the radix tree
135  * RT_LOCK_EXCLUSIVE - Lock the radix tree in exclusive mode
136  * RT_LOCK_SHARE - Lock the radix tree in share mode
137  * RT_UNLOCK - Unlock the radix tree
138  * RT_GET_HANDLE - Return the handle of the radix tree
139  *
140  * Optional Interface
141  * ---------
142  *
143  * RT_DELETE - Delete a key-value pair. Declared/defined if RT_USE_DELETE is defined
144  *
145  *
146  * Copyright (c) 2024, PostgreSQL Global Development Group
147  *
148  * IDENTIFICATION
149  * src/include/lib/radixtree.h
150  *
151  *-------------------------------------------------------------------------
152  */
153 
154 #include "nodes/bitmapset.h"
155 #include "port/pg_bitutils.h"
156 #include "port/simd.h"
157 #include "utils/dsa.h"
158 #include "utils/memutils.h"
159 #ifdef RT_SHMEM
160 #include "miscadmin.h"
161 #include "storage/lwlock.h"
162 #endif
163 
164 /* helpers */
165 #define RT_MAKE_PREFIX(a) CppConcat(a,_)
166 #define RT_MAKE_NAME(name) RT_MAKE_NAME_(RT_MAKE_PREFIX(RT_PREFIX),name)
167 #define RT_MAKE_NAME_(a,b) CppConcat(a,b)
168 /*
169  * stringify a macro constant, from https://gcc.gnu.org/onlinedocs/cpp/Stringizing.html
170  */
171 #define RT_STR(s) RT_STR_(s)
172 #define RT_STR_(s) #s
173 
174 /* function declarations */
175 #define RT_CREATE RT_MAKE_NAME(create)
176 #define RT_FREE RT_MAKE_NAME(free)
177 #define RT_FIND RT_MAKE_NAME(find)
178 #ifdef RT_SHMEM
179 #define RT_ATTACH RT_MAKE_NAME(attach)
180 #define RT_DETACH RT_MAKE_NAME(detach)
181 #define RT_GET_HANDLE RT_MAKE_NAME(get_handle)
182 #define RT_LOCK_EXCLUSIVE RT_MAKE_NAME(lock_exclusive)
183 #define RT_LOCK_SHARE RT_MAKE_NAME(lock_share)
184 #define RT_UNLOCK RT_MAKE_NAME(unlock)
185 #endif
186 #define RT_SET RT_MAKE_NAME(set)
187 #define RT_BEGIN_ITERATE RT_MAKE_NAME(begin_iterate)
188 #define RT_ITERATE_NEXT RT_MAKE_NAME(iterate_next)
189 #define RT_END_ITERATE RT_MAKE_NAME(end_iterate)
190 #ifdef RT_USE_DELETE
191 #define RT_DELETE RT_MAKE_NAME(delete)
192 #endif
193 #define RT_MEMORY_USAGE RT_MAKE_NAME(memory_usage)
194 #define RT_DUMP_NODE RT_MAKE_NAME(dump_node)
195 #define RT_STATS RT_MAKE_NAME(stats)
196 
197 /* internal helper functions (no externally visible prototypes) */
198 #define RT_CHILDPTR_IS_VALUE RT_MAKE_NAME(childptr_is_value)
199 #define RT_VALUE_IS_EMBEDDABLE RT_MAKE_NAME(value_is_embeddable)
200 #define RT_GET_SLOT_RECURSIVE RT_MAKE_NAME(get_slot_recursive)
201 #define RT_DELETE_RECURSIVE RT_MAKE_NAME(delete_recursive)
202 #define RT_ALLOC_NODE RT_MAKE_NAME(alloc_node)
203 #define RT_ALLOC_LEAF RT_MAKE_NAME(alloc_leaf)
204 #define RT_FREE_NODE RT_MAKE_NAME(free_node)
205 #define RT_FREE_LEAF RT_MAKE_NAME(free_leaf)
206 #define RT_FREE_RECURSE RT_MAKE_NAME(free_recurse)
207 #define RT_EXTEND_UP RT_MAKE_NAME(extend_up)
208 #define RT_EXTEND_DOWN RT_MAKE_NAME(extend_down)
209 #define RT_COPY_COMMON RT_MAKE_NAME(copy_common)
210 #define RT_PTR_SET_LOCAL RT_MAKE_NAME(ptr_set_local)
211 #define RT_NODE_16_SEARCH_EQ RT_MAKE_NAME(node_16_search_eq)
212 #define RT_NODE_4_GET_INSERTPOS RT_MAKE_NAME(node_4_get_insertpos)
213 #define RT_NODE_16_GET_INSERTPOS RT_MAKE_NAME(node_16_get_insertpos)
214 #define RT_SHIFT_ARRAYS_FOR_INSERT RT_MAKE_NAME(shift_arrays_for_insert)
215 #define RT_SHIFT_ARRAYS_AND_DELETE RT_MAKE_NAME(shift_arrays_and_delete)
216 #define RT_COPY_ARRAYS_FOR_INSERT RT_MAKE_NAME(copy_arrays_for_insert)
217 #define RT_COPY_ARRAYS_AND_DELETE RT_MAKE_NAME(copy_arrays_and_delete)
218 #define RT_NODE_48_IS_CHUNK_USED RT_MAKE_NAME(node_48_is_chunk_used)
219 #define RT_NODE_48_GET_CHILD RT_MAKE_NAME(node_48_get_child)
220 #define RT_NODE_256_IS_CHUNK_USED RT_MAKE_NAME(node_256_is_chunk_used)
221 #define RT_NODE_256_GET_CHILD RT_MAKE_NAME(node_256_get_child)
222 #define RT_KEY_GET_SHIFT RT_MAKE_NAME(key_get_shift)
223 #define RT_SHIFT_GET_MAX_VAL RT_MAKE_NAME(shift_get_max_val)
224 #define RT_NODE_SEARCH RT_MAKE_NAME(node_search)
225 #define RT_NODE_DELETE RT_MAKE_NAME(node_delete)
226 #define RT_NODE_INSERT RT_MAKE_NAME(node_insert)
227 #define RT_ADD_CHILD_4 RT_MAKE_NAME(add_child_4)
228 #define RT_ADD_CHILD_16 RT_MAKE_NAME(add_child_16)
229 #define RT_ADD_CHILD_48 RT_MAKE_NAME(add_child_48)
230 #define RT_ADD_CHILD_256 RT_MAKE_NAME(add_child_256)
231 #define RT_GROW_NODE_4 RT_MAKE_NAME(grow_node_4)
232 #define RT_GROW_NODE_16 RT_MAKE_NAME(grow_node_16)
233 #define RT_GROW_NODE_48 RT_MAKE_NAME(grow_node_48)
234 #define RT_REMOVE_CHILD_4 RT_MAKE_NAME(remove_child_4)
235 #define RT_REMOVE_CHILD_16 RT_MAKE_NAME(remove_child_16)
236 #define RT_REMOVE_CHILD_48 RT_MAKE_NAME(remove_child_48)
237 #define RT_REMOVE_CHILD_256 RT_MAKE_NAME(remove_child_256)
238 #define RT_SHRINK_NODE_16 RT_MAKE_NAME(shrink_child_16)
239 #define RT_SHRINK_NODE_48 RT_MAKE_NAME(shrink_child_48)
240 #define RT_SHRINK_NODE_256 RT_MAKE_NAME(shrink_child_256)
241 #define RT_NODE_ITERATE_NEXT RT_MAKE_NAME(node_iterate_next)
242 #define RT_VERIFY_NODE RT_MAKE_NAME(verify_node)
243 
244 /* type declarations */
245 #define RT_RADIX_TREE RT_MAKE_NAME(radix_tree)
246 #define RT_RADIX_TREE_CONTROL RT_MAKE_NAME(radix_tree_control)
247 #define RT_ITER RT_MAKE_NAME(iter)
248 #ifdef RT_SHMEM
249 #define RT_HANDLE RT_MAKE_NAME(handle)
250 #endif
251 #define RT_NODE RT_MAKE_NAME(node)
252 #define RT_CHILD_PTR RT_MAKE_NAME(child_ptr)
253 #define RT_NODE_ITER RT_MAKE_NAME(node_iter)
254 #define RT_NODE_4 RT_MAKE_NAME(node_4)
255 #define RT_NODE_16 RT_MAKE_NAME(node_16)
256 #define RT_NODE_48 RT_MAKE_NAME(node_48)
257 #define RT_NODE_256 RT_MAKE_NAME(node_256)
258 #define RT_SIZE_CLASS RT_MAKE_NAME(size_class)
259 #define RT_SIZE_CLASS_ELEM RT_MAKE_NAME(size_class_elem)
260 #define RT_SIZE_CLASS_INFO RT_MAKE_NAME(size_class_info)
261 #define RT_CLASS_4 RT_MAKE_NAME(class_4)
262 #define RT_CLASS_16_LO RT_MAKE_NAME(class_32_min)
263 #define RT_CLASS_16_HI RT_MAKE_NAME(class_32_max)
264 #define RT_CLASS_48 RT_MAKE_NAME(class_48)
265 #define RT_CLASS_256 RT_MAKE_NAME(class_256)
266 
267 /* generate forward declarations necessary to use the radix tree */
268 #ifdef RT_DECLARE
269 
270 typedef struct RT_RADIX_TREE RT_RADIX_TREE;
271 typedef struct RT_ITER RT_ITER;
272 
273 #ifdef RT_SHMEM
274 typedef dsa_pointer RT_HANDLE;
275 #endif
276 
277 #ifdef RT_SHMEM
278 RT_SCOPE RT_RADIX_TREE *RT_CREATE(MemoryContext ctx, dsa_area *dsa, int tranche_id);
279 RT_SCOPE RT_RADIX_TREE *RT_ATTACH(dsa_area *dsa, dsa_pointer dp);
280 RT_SCOPE void RT_DETACH(RT_RADIX_TREE * tree);
281 RT_SCOPE RT_HANDLE RT_GET_HANDLE(RT_RADIX_TREE * tree);
282 RT_SCOPE void RT_LOCK_EXCLUSIVE(RT_RADIX_TREE * tree);
283 RT_SCOPE void RT_LOCK_SHARE(RT_RADIX_TREE * tree);
284 RT_SCOPE void RT_UNLOCK(RT_RADIX_TREE * tree);
285 #else
286 RT_SCOPE RT_RADIX_TREE *RT_CREATE(MemoryContext ctx);
287 #endif
288 RT_SCOPE void RT_FREE(RT_RADIX_TREE * tree);
289 
290 RT_SCOPE RT_VALUE_TYPE *RT_FIND(RT_RADIX_TREE * tree, uint64 key);
291 RT_SCOPE bool RT_SET(RT_RADIX_TREE * tree, uint64 key, RT_VALUE_TYPE * value_p);
292 
293 #ifdef RT_USE_DELETE
294 RT_SCOPE bool RT_DELETE(RT_RADIX_TREE * tree, uint64 key);
295 #endif
296 
297 RT_SCOPE RT_ITER *RT_BEGIN_ITERATE(RT_RADIX_TREE * tree);
298 RT_SCOPE RT_VALUE_TYPE *RT_ITERATE_NEXT(RT_ITER * iter, uint64 *key_p);
299 RT_SCOPE void RT_END_ITERATE(RT_ITER * iter);
300 
301 RT_SCOPE uint64 RT_MEMORY_USAGE(RT_RADIX_TREE * tree);
302 
303 #ifdef RT_DEBUG
304 RT_SCOPE void RT_STATS(RT_RADIX_TREE * tree);
305 #endif
306 
307 #endif /* RT_DECLARE */
308 
309 
310 /* generate implementation of the radix tree */
311 #ifdef RT_DEFINE
312 
313 /* The number of bits encoded in one tree level */
314 #define RT_SPAN BITS_PER_BYTE
315 
316 /*
317  * The number of possible partial keys, and thus the maximum number of
318  * child pointers, for a node.
319  */
320 #define RT_NODE_MAX_SLOTS (1 << RT_SPAN)
321 
322 /* Mask for extracting a chunk from a key */
323 #define RT_CHUNK_MASK ((1 << RT_SPAN) - 1)
324 
325 /* Maximum shift needed to extract a chunk from a key */
326 #define RT_MAX_SHIFT RT_KEY_GET_SHIFT(UINT64_MAX)
327 
328 /* Maximum level a tree can reach for a key */
329 #define RT_MAX_LEVEL ((sizeof(uint64) * BITS_PER_BYTE) / RT_SPAN)
330 
331 /* Get a chunk from the key */
332 #define RT_GET_KEY_CHUNK(key, shift) ((uint8) (((key) >> (shift)) & RT_CHUNK_MASK))
333 
334 /* For accessing bitmaps */
335 #define RT_BM_IDX(x) ((x) / BITS_PER_BITMAPWORD)
336 #define RT_BM_BIT(x) ((x) % BITS_PER_BITMAPWORD)
337 
338 /*
339  * Node kinds
340  *
341  * The different node kinds are what make the tree "adaptive".
342  *
343  * Each node kind is associated with a different datatype and different
344  * search/set/delete/iterate algorithms adapted for its size. The largest
345  * kind, node256 is basically the same as a traditional radix tree,
346  * and would be most wasteful of memory when sparsely populated. The
347  * smaller nodes expend some additional CPU time to enable a smaller
348  * memory footprint.
349  *
350  * NOTE: There are 4 node kinds, and this should never be increased,
351  * for several reasons:
352  * 1. With 5 or more kinds, gcc tends to use a jump table for switch
353  * statements.
354  * 2. The 4 kinds can be represented with 2 bits, so we have the option
355  * in the future to tag the node pointer with the kind, even on
356  * platforms with 32-bit pointers. That would touch fewer cache lines
357  * during traversal and allow faster recovery from branch mispredicts.
358  * 3. We can have multiple size classes per node kind.
359  */
360 #define RT_NODE_KIND_4 0x00
361 #define RT_NODE_KIND_16 0x01
362 #define RT_NODE_KIND_48 0x02
363 #define RT_NODE_KIND_256 0x03
364 #define RT_NODE_KIND_COUNT 4
365 
366 /*
367  * Calculate the slab block size so that we can allocate at least 32 chunks
368  * from the block.
369  */
370 #define RT_SLAB_BLOCK_SIZE(size) \
371  Max(SLAB_DEFAULT_BLOCK_SIZE, pg_nextpower2_32(size * 32))
372 
373 /* Common header for all nodes */
374 typedef struct RT_NODE
375 {
376  /* Node kind, one per search/set algorithm */
377  uint8 kind;
378 
379  /*
380  * Max capacity for the current size class. Storing this in the node
381  * enables multiple size classes per node kind. uint8 is sufficient for
382  * all node kinds, because we only use this number to test if the node
383  * needs to grow. Since node256 never needs to grow, we let this overflow
384  * to zero.
385  */
386  uint8 fanout;
387 
388  /*
389  * Number of children. uint8 is sufficient for all node kinds, because
390  * nodes shrink when this number gets lower than some threshold. Since
391  * node256 cannot possibly have zero children, we let the counter overflow
392  * and we interpret zero as "256" for this node kind.
393  */
394  uint8 count;
395 } RT_NODE;
396 
397 
398 /* pointer returned by allocation */
399 #ifdef RT_SHMEM
400 #define RT_PTR_ALLOC dsa_pointer
401 #define RT_INVALID_PTR_ALLOC InvalidDsaPointer
402 #define RT_PTR_ALLOC_IS_VALID(ptr) DsaPointerIsValid(ptr)
403 #else
404 #define RT_PTR_ALLOC RT_NODE *
405 #define RT_INVALID_PTR_ALLOC NULL
406 #define RT_PTR_ALLOC_IS_VALID(ptr) PointerIsValid(ptr)
407 #endif
408 
409 /*
410  * A convenience type used when we need to work with a DSA pointer as well
411  * as its local pointer. For local memory, both members are the same, so
412  * we use a union.
413  */
414 #ifdef RT_SHMEM
415 typedef struct RT_CHILD_PTR
416 #else
417 typedef union RT_CHILD_PTR
418 #endif
419 {
420  RT_PTR_ALLOC alloc;
421  RT_NODE *local;
422 } RT_CHILD_PTR;
423 
424 
425 /*
426  * Helper macros and functions for value storage.
427  * We either embed values in the child slots of the last level
428  * node or store pointers to values to the child slots,
429  * depending on the value size.
430  */
431 
432 #ifdef RT_VARLEN_VALUE_SIZE
433 #define RT_GET_VALUE_SIZE(v) RT_VARLEN_VALUE_SIZE(v)
434 #else
435 #define RT_GET_VALUE_SIZE(v) sizeof(RT_VALUE_TYPE)
436 #endif
437 
438 /*
439  * Return true if the value can be stored in the child array
440  * of the lowest-level node, false otherwise.
441  */
442 static inline bool
443 RT_VALUE_IS_EMBEDDABLE(RT_VALUE_TYPE * value_p)
444 {
445 #ifdef RT_VARLEN_VALUE_SIZE
446 
447 #ifdef RT_RUNTIME_EMBEDDABLE_VALUE
448  return RT_GET_VALUE_SIZE(value_p) <= sizeof(RT_PTR_ALLOC);
449 #else
450  return false;
451 #endif
452 
453 #else
454  return RT_GET_VALUE_SIZE(value_p) <= sizeof(RT_PTR_ALLOC);
455 #endif
456 }
457 
458 /*
459  * Return true if the child pointer contains the value, false
460  * if the child pointer is a leaf pointer.
461  */
462 static inline bool
463 RT_CHILDPTR_IS_VALUE(RT_PTR_ALLOC child)
464 {
465 #ifdef RT_VARLEN_VALUE_SIZE
466 
467 #ifdef RT_RUNTIME_EMBEDDABLE_VALUE
468  /* check for pointer tag */
469 #ifdef RT_SHMEM
470  return child & 1;
471 #else
472  return ((uintptr_t) child) & 1;
473 #endif
474 
475 #else
476  return false;
477 #endif
478 
479 #else
480  return sizeof(RT_VALUE_TYPE) <= sizeof(RT_PTR_ALLOC);
481 #endif
482 }
483 
484 /*
485  * Symbols for maximum possible fanout are declared first as they are
486  * required to declare each node kind. The declarations of other fanout
487  * values are followed as they need the struct sizes of each node kind.
488  */
489 
490 /* max possible key chunks without struct padding */
491 #define RT_FANOUT_4_MAX (8 - sizeof(RT_NODE))
492 
493 /* equal to two 128-bit SIMD registers, regardless of availability */
494 #define RT_FANOUT_16_MAX 32
495 
496 /*
497  * This also determines the number of bits necessary for the isset array,
498  * so we need to be mindful of the size of bitmapword. Since bitmapword
499  * can be 64 bits, the only values that make sense here are 64 and 128.
500  * The ART paper uses at most 64 for this node kind, and one advantage
501  * for us is that "isset" is a single bitmapword on most platforms,
502  * rather than an array, allowing the compiler to get rid of loops.
503  */
504 #define RT_FANOUT_48_MAX 64
505 
506 #define RT_FANOUT_256 RT_NODE_MAX_SLOTS
507 
508 /*
509  * Node structs, one for each "kind"
510  */
511 
512 /*
513  * node4 and node16 use one array for key chunks and another
514  * array of the same length for children. The keys and children
515  * are stored at corresponding positions, sorted by chunk.
516  */
517 
518 typedef struct RT_NODE_4
519 {
520  RT_NODE base;
521 
522  uint8 chunks[RT_FANOUT_4_MAX];
523 
524  /* number of children depends on size class */
525  RT_PTR_ALLOC children[FLEXIBLE_ARRAY_MEMBER];
526 } RT_NODE_4;
527 
528 typedef struct RT_NODE_16
529 {
530  RT_NODE base;
531 
532  uint8 chunks[RT_FANOUT_16_MAX];
533 
534  /* number of children depends on size class */
535  RT_PTR_ALLOC children[FLEXIBLE_ARRAY_MEMBER];
536 } RT_NODE_16;
537 
538 /*
539  * node48 uses a 256-element array indexed by key chunks. This array
540  * stores indexes into a second array containing the children.
541  */
542 typedef struct RT_NODE_48
543 {
544  RT_NODE base;
545 
546  /* bitmap to track which slots are in use */
547  bitmapword isset[RT_BM_IDX(RT_FANOUT_48_MAX)];
548 
549  /*
550  * Lookup table for indexes into the children[] array. We make this the
551  * last fixed-size member so that it's convenient to memset separately
552  * from the previous members.
553  */
554  uint8 slot_idxs[RT_NODE_MAX_SLOTS];
555 
556 /* Invalid index */
557 #define RT_INVALID_SLOT_IDX 0xFF
558 
559  /* number of children depends on size class */
560  RT_PTR_ALLOC children[FLEXIBLE_ARRAY_MEMBER];
561 } RT_NODE_48;
562 
563 /*
564  * node256 is the largest node type. This node has an array of
565  * children directly indexed by chunk. Unlike other node kinds,
566  * its array size is by definition fixed.
567  */
568 typedef struct RT_NODE_256
569 {
570  RT_NODE base;
571 
572  /* bitmap to track which slots are in use */
573  bitmapword isset[RT_BM_IDX(RT_FANOUT_256)];
574 
575  /* slots for 256 children */
576  RT_PTR_ALLOC children[RT_FANOUT_256];
577 } RT_NODE_256;
578 
579 #if defined(RT_SHMEM)
580 /*
581  * Make sure that all nodes (except for node256) fit neatly into a DSA
582  * size class. We assume RT_FANOUT_4 is in the range where DSA size
583  * classes increment by 8 (as of PG17, up to 64 bytes), so we just
584  * hard-code that one.
585  */
586 
587 #if SIZEOF_DSA_POINTER < 8
588 #define RT_FANOUT_16_LO ((96 - offsetof(RT_NODE_16, children)) / sizeof(RT_PTR_ALLOC))
589 #define RT_FANOUT_16_HI Min(RT_FANOUT_16_MAX, (160 - offsetof(RT_NODE_16, children)) / sizeof(RT_PTR_ALLOC))
590 #define RT_FANOUT_48 Min(RT_FANOUT_48_MAX, (512 - offsetof(RT_NODE_48, children)) / sizeof(RT_PTR_ALLOC))
591 #else
592 #define RT_FANOUT_16_LO ((160 - offsetof(RT_NODE_16, children)) / sizeof(RT_PTR_ALLOC))
593 #define RT_FANOUT_16_HI Min(RT_FANOUT_16_MAX, (320 - offsetof(RT_NODE_16, children)) / sizeof(RT_PTR_ALLOC))
594 #define RT_FANOUT_48 Min(RT_FANOUT_48_MAX, (768 - offsetof(RT_NODE_48, children)) / sizeof(RT_PTR_ALLOC))
595 #endif /* SIZEOF_DSA_POINTER < 8 */
596 
597 #else /* ! RT_SHMEM */
598 
599 /* doesn't really matter, but may as well use the namesake */
600 #define RT_FANOUT_16_LO 16
601 /* use maximum possible */
602 #define RT_FANOUT_16_HI RT_FANOUT_16_MAX
603 #define RT_FANOUT_48 RT_FANOUT_48_MAX
604 
605 #endif /* RT_SHMEM */
606 
607 /*
608  * To save memory in trees with sparse keys, it would make sense to have two
609  * size classes for the smallest kind (perhaps a high class of 5 and a low class
610  * of 2), but it would be more effective to utilize lazy expansion and
611  * path compression.
612  */
613 #define RT_FANOUT_4 4
614 
615 StaticAssertDecl(RT_FANOUT_4 <= RT_FANOUT_4_MAX, "watch struct padding");
616 StaticAssertDecl(RT_FANOUT_16_LO < RT_FANOUT_16_HI, "LO subclass bigger than HI");
617 StaticAssertDecl(RT_FANOUT_48 <= RT_FANOUT_48_MAX, "more slots than isset bits");
618 
619 /*
620  * Node size classes
621  *
622  * Nodes of different kinds necessarily belong to different size classes.
623  * One innovation in our implementation compared to the ART paper is
624  * decoupling the notion of size class from kind.
625  *
626  * The size classes within a given node kind have the same underlying
627  * type, but a variable number of children/values. This is possible
628  * because each type (except node256) contains metadata that work the
629  * same way regardless of how many child slots there are. The nodes
630  * can introspect their allocated capacity at runtime.
631  */
632 typedef enum RT_SIZE_CLASS
633 {
634  RT_CLASS_4 = 0,
635  RT_CLASS_16_LO,
636  RT_CLASS_16_HI,
637  RT_CLASS_48,
638  RT_CLASS_256,
639 } RT_SIZE_CLASS;
640 
641 /* Information for each size class */
642 typedef struct RT_SIZE_CLASS_ELEM
643 {
644  const char *name;
645  int fanout;
646  size_t allocsize;
647 } RT_SIZE_CLASS_ELEM;
648 
649 
650 static const RT_SIZE_CLASS_ELEM RT_SIZE_CLASS_INFO[] = {
651  [RT_CLASS_4] = {
652  .name = RT_STR(RT_PREFIX) "_radix_tree node4",
653  .fanout = RT_FANOUT_4,
654  .allocsize = sizeof(RT_NODE_4) + RT_FANOUT_4 * sizeof(RT_PTR_ALLOC),
655  },
656  [RT_CLASS_16_LO] = {
657  .name = RT_STR(RT_PREFIX) "_radix_tree node16_lo",
658  .fanout = RT_FANOUT_16_LO,
659  .allocsize = sizeof(RT_NODE_16) + RT_FANOUT_16_LO * sizeof(RT_PTR_ALLOC),
660  },
661  [RT_CLASS_16_HI] = {
662  .name = RT_STR(RT_PREFIX) "_radix_tree node16_hi",
663  .fanout = RT_FANOUT_16_HI,
664  .allocsize = sizeof(RT_NODE_16) + RT_FANOUT_16_HI * sizeof(RT_PTR_ALLOC),
665  },
666  [RT_CLASS_48] = {
667  .name = RT_STR(RT_PREFIX) "_radix_tree node48",
668  .fanout = RT_FANOUT_48,
669  .allocsize = sizeof(RT_NODE_48) + RT_FANOUT_48 * sizeof(RT_PTR_ALLOC),
670  },
671  [RT_CLASS_256] = {
672  .name = RT_STR(RT_PREFIX) "_radix_tree node256",
673  .fanout = RT_FANOUT_256,
674  .allocsize = sizeof(RT_NODE_256),
675  },
676 };
677 
678 #define RT_NUM_SIZE_CLASSES lengthof(RT_SIZE_CLASS_INFO)
679 
680 #ifdef RT_SHMEM
681 /* A magic value used to identify our radix tree */
682 #define RT_RADIX_TREE_MAGIC 0x54A48167
683 #endif
684 
685 /* Contains the actual tree, plus ancillary info */
686 typedef struct RT_RADIX_TREE_CONTROL
687 {
688 #ifdef RT_SHMEM
689  RT_HANDLE handle;
690  uint32 magic;
691  LWLock lock;
692 #endif
693 
694  RT_PTR_ALLOC root;
695  uint64 max_val;
696  int64 num_keys;
697  int start_shift;
698 
699  /* statistics */
700 #ifdef RT_DEBUG
701  int64 num_nodes[RT_NUM_SIZE_CLASSES];
702  int64 num_leaves;
703 #endif
704 } RT_RADIX_TREE_CONTROL;
705 
706 /* Entry point for allocating and accessing the tree */
707 struct RT_RADIX_TREE
708 {
709  MemoryContext context;
710 
711  /* pointing to either local memory or DSA */
712  RT_RADIX_TREE_CONTROL *ctl;
713 
714 #ifdef RT_SHMEM
715  dsa_area *dsa;
716 #else
717  MemoryContextData *node_slabs[RT_NUM_SIZE_CLASSES];
718 
719  /* leaf_context is used only for single-value leaves */
720  MemoryContextData *leaf_context;
721 #endif
722  MemoryContextData *iter_context;
723 };
724 
725 /*
726  * Iteration support.
727  *
728  * Iterating over the radix tree produces each key/value pair in ascending
729  * order of the key.
730  */
731 
732 /* state for iterating over a single node */
733 typedef struct RT_NODE_ITER
734 {
735  RT_CHILD_PTR node;
736 
737  /*
738  * The next index of the chunk array in RT_NODE_KIND_4 and RT_NODE_KIND_16
739  * nodes, or the next chunk in RT_NODE_KIND_48 and RT_NODE_KIND_256 nodes.
740  * 0 for the initial value.
741  */
742  int idx;
743 } RT_NODE_ITER;
744 
745 /* state for iterating over the whole radix tree */
746 struct RT_ITER
747 {
748  RT_RADIX_TREE *tree;
749 
750  /*
751  * A stack to track iteration for each level. Level 0 is the lowest (or
752  * leaf) level
753  */
754  RT_NODE_ITER node_iters[RT_MAX_LEVEL];
755  int top_level;
756  int cur_level;
757 
758  /* The key constructed during iteration */
759  uint64 key;
760 };
761 
762 
763 /* verification (available only in assert-enabled builds) */
764 static void RT_VERIFY_NODE(RT_NODE * node);
765 
766 static inline void
767 RT_PTR_SET_LOCAL(RT_RADIX_TREE * tree, RT_CHILD_PTR * node)
768 {
769 #ifdef RT_SHMEM
770  node->local = dsa_get_address(tree->dsa, node->alloc);
771 #endif
772 }
773 
774 /* Convenience functions for node48 and node256 */
775 
776 /* Return true if there is an entry for "chunk" */
777 static inline bool
778 RT_NODE_48_IS_CHUNK_USED(RT_NODE_48 * node, uint8 chunk)
779 {
780  return node->slot_idxs[chunk] != RT_INVALID_SLOT_IDX;
781 }
782 
783 static inline RT_PTR_ALLOC *
784 RT_NODE_48_GET_CHILD(RT_NODE_48 * node, uint8 chunk)
785 {
786  return &node->children[node->slot_idxs[chunk]];
787 }
788 
789 /* Return true if there is an entry for "chunk" */
790 static inline bool
791 RT_NODE_256_IS_CHUNK_USED(RT_NODE_256 * node, uint8 chunk)
792 {
793  int idx = RT_BM_IDX(chunk);
794  int bitnum = RT_BM_BIT(chunk);
795 
796  return (node->isset[idx] & ((bitmapword) 1 << bitnum)) != 0;
797 }
798 
799 static inline RT_PTR_ALLOC *
800 RT_NODE_256_GET_CHILD(RT_NODE_256 * node, uint8 chunk)
801 {
802  Assert(RT_NODE_256_IS_CHUNK_USED(node, chunk));
803  return &node->children[chunk];
804 }
805 
806 /*
807  * Return the smallest shift that allows storing the given key.
808  */
809 static inline int
810 RT_KEY_GET_SHIFT(uint64 key)
811 {
812  if (key == 0)
813  return 0;
814  else
815  return pg_leftmost_one_pos64(key) / RT_SPAN * RT_SPAN;
816 }
817 
818 /*
819  * Return the max value that can be stored in the tree with the given shift.
820  */
821 static uint64
822 RT_SHIFT_GET_MAX_VAL(int shift)
823 {
824  if (shift == RT_MAX_SHIFT)
825  return UINT64_MAX;
826  else
827  return (UINT64CONST(1) << (shift + RT_SPAN)) - 1;
828 }
829 
830 /*
831  * Allocate a new node with the given node kind and size class.
832  */
833 static inline RT_CHILD_PTR
834 RT_ALLOC_NODE(RT_RADIX_TREE * tree, const uint8 kind, const RT_SIZE_CLASS size_class)
835 {
836  RT_CHILD_PTR allocnode;
837  RT_NODE *node;
838  size_t allocsize;
839 
840  allocsize = RT_SIZE_CLASS_INFO[size_class].allocsize;
841 
842 #ifdef RT_SHMEM
843  allocnode.alloc = dsa_allocate(tree->dsa, allocsize);
844 #else
845  allocnode.alloc = (RT_PTR_ALLOC) MemoryContextAlloc(tree->node_slabs[size_class],
846  allocsize);
847 #endif
848 
849  RT_PTR_SET_LOCAL(tree, &allocnode);
850  node = allocnode.local;
851 
852  /* initialize contents */
853 
854  switch (kind)
855  {
856  case RT_NODE_KIND_4:
857  memset(node, 0, offsetof(RT_NODE_4, children));
858  break;
859  case RT_NODE_KIND_16:
860  memset(node, 0, offsetof(RT_NODE_16, children));
861  break;
862  case RT_NODE_KIND_48:
863  {
864  RT_NODE_48 *n48 = (RT_NODE_48 *) node;
865 
866  memset(n48, 0, offsetof(RT_NODE_48, slot_idxs));
867  memset(n48->slot_idxs, RT_INVALID_SLOT_IDX, sizeof(n48->slot_idxs));
868  break;
869  }
870  case RT_NODE_KIND_256:
871  memset(node, 0, offsetof(RT_NODE_256, children));
872  break;
873  default:
874  pg_unreachable();
875  }
876 
877  node->kind = kind;
878 
879  /*
880  * For node256, this will actually overflow to zero, but that's okay
881  * because that node doesn't need to introspect this value.
882  */
883  node->fanout = RT_SIZE_CLASS_INFO[size_class].fanout;
884 
885 #ifdef RT_DEBUG
886  /* update the statistics */
887  tree->ctl->num_nodes[size_class]++;
888 #endif
889 
890  return allocnode;
891 }
892 
893 /*
894  * Allocate a new leaf.
895  */
896 static RT_CHILD_PTR
897 RT_ALLOC_LEAF(RT_RADIX_TREE * tree, size_t allocsize)
898 {
899  RT_CHILD_PTR leaf;
900 
901 #ifdef RT_SHMEM
902  leaf.alloc = dsa_allocate(tree->dsa, allocsize);
903  RT_PTR_SET_LOCAL(tree, &leaf);
904 #else
905  leaf.alloc = (RT_PTR_ALLOC) MemoryContextAlloc(tree->leaf_context, allocsize);
906 #endif
907 
908 #ifdef RT_DEBUG
909  tree->ctl->num_leaves++;
910 #endif
911 
912  return leaf;
913 }
914 
915 /*
916  * Copy relevant members of the node header.
917  * This is a separate function in case other fields are added.
918  */
919 static inline void
920 RT_COPY_COMMON(RT_CHILD_PTR newnode, RT_CHILD_PTR oldnode)
921 {
922  (newnode.local)->count = (oldnode.local)->count;
923 }
924 
925 /* Free the given node */
926 static void
927 RT_FREE_NODE(RT_RADIX_TREE * tree, RT_CHILD_PTR node)
928 {
929 #ifdef RT_DEBUG
930  int i;
931 
932  /* update the statistics */
933 
934  for (i = 0; i < RT_NUM_SIZE_CLASSES; i++)
935  {
936  if ((node.local)->fanout == RT_SIZE_CLASS_INFO[i].fanout)
937  break;
938  }
939 
940  /*
941  * The fanout of node256 will appear to be zero within the node header
942  * because of overflow, so we need an extra check here.
943  */
944  if (i == RT_NUM_SIZE_CLASSES)
945  i = RT_CLASS_256;
946 
947  tree->ctl->num_nodes[i]--;
948  Assert(tree->ctl->num_nodes[i] >= 0);
949 #endif
950 
951 #ifdef RT_SHMEM
952  dsa_free(tree->dsa, node.alloc);
953 #else
954  pfree(node.alloc);
955 #endif
956 }
957 
958 static inline void
959 RT_FREE_LEAF(RT_RADIX_TREE * tree, RT_PTR_ALLOC leaf)
960 {
961  Assert(leaf != tree->ctl->root);
962 
963 #ifdef RT_DEBUG
964  /* update the statistics */
965  tree->ctl->num_leaves--;
966  Assert(tree->ctl->num_leaves >= 0);
967 #endif
968 
969 #ifdef RT_SHMEM
970  dsa_free(tree->dsa, leaf);
971 #else
972  pfree(leaf);
973 #endif
974 }
975 
976 /***************** SEARCH *****************/
977 
978 /*
979  * Return the address of the child corresponding to "chunk",
980  * or NULL if there is no such element.
981  */
982 static inline RT_PTR_ALLOC *
983 RT_NODE_16_SEARCH_EQ(RT_NODE_16 * node, uint8 chunk)
984 {
985  int count = node->base.count;
986 #ifndef USE_NO_SIMD
987  Vector8 spread_chunk;
988  Vector8 haystack1;
989  Vector8 haystack2;
990  Vector8 cmp1;
991  Vector8 cmp2;
992  uint32 bitfield;
993  RT_PTR_ALLOC *slot_simd = NULL;
994 #endif
995 
996 #if defined(USE_NO_SIMD) || defined(USE_ASSERT_CHECKING)
997  RT_PTR_ALLOC *slot = NULL;
998 
999  for (int i = 0; i < count; i++)
1000  {
1001  if (node->chunks[i] == chunk)
1002  {
1003  slot = &node->children[i];
1004  break;
1005  }
1006  }
1007 #endif
1008 
1009 #ifndef USE_NO_SIMD
1010  /* replicate the search key */
1011  spread_chunk = vector8_broadcast(chunk);
1012 
1013  /* compare to all 32 keys stored in the node */
1014  vector8_load(&haystack1, &node->chunks[0]);
1015  vector8_load(&haystack2, &node->chunks[sizeof(Vector8)]);
1016  cmp1 = vector8_eq(spread_chunk, haystack1);
1017  cmp2 = vector8_eq(spread_chunk, haystack2);
1018 
1019  /* convert comparison to a bitfield */
1020  bitfield = vector8_highbit_mask(cmp1) | (vector8_highbit_mask(cmp2) << sizeof(Vector8));
1021 
1022  /* mask off invalid entries */
1023  bitfield &= ((UINT64CONST(1) << count) - 1);
1024 
1025  /* convert bitfield to index by counting trailing zeros */
1026  if (bitfield)
1027  slot_simd = &node->children[pg_rightmost_one_pos32(bitfield)];
1028 
1029  Assert(slot_simd == slot);
1030  return slot_simd;
1031 #else
1032  return slot;
1033 #endif
1034 }
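The SIMD search above can be sketched in portable scalar C. This is an illustrative stand-alone version, not part of the template: the helper name is made up and the GCC/Clang `__builtin_ctz` builtin is an assumption. Each chunk comparison contributes one bit to a bitfield, slots beyond `count` are masked off, and the index of the first match falls out of a trailing-zero count.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar sketch of the node-16 equality search (illustrative only):
 * build a bitfield with one bit per matching position, mask it down to
 * the first "count" slots, and take the lowest set bit as the index. */
static int
node16_search_eq_sketch(const uint8_t *chunks, int count, uint8_t chunk)
{
    uint32_t bitfield = 0;

    /* emulate vector8_broadcast + vector8_eq + vector8_highbit_mask */
    for (int i = 0; i < 32; i++)
        bitfield |= (uint32_t) (chunks[i] == chunk) << i;

    /* mask off invalid entries beyond "count" */
    if (count < 32)
        bitfield &= (UINT32_C(1) << count) - 1;

    /* index of first match = number of trailing zeros (GCC/Clang builtin) */
    return bitfield ? __builtin_ctz(bitfield) : -1;
}
```

The masking step matters: stale bytes past `count` may still equal the search key, so their comparison bits must be discarded before the bit scan.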
1035 
1036 /*
1037  * Search for the child pointer corresponding to "key" in the given node.
1038  *
1039  * Return child if the key is found, otherwise return NULL.
1040  */
1041 static inline RT_PTR_ALLOC *
1042 RT_NODE_SEARCH(RT_NODE * node, uint8 chunk)
1043 {
1044  /* Make sure we already converted to local pointer */
1045  Assert(node != NULL);
1046 
1047  switch (node->kind)
1048  {
1049  case RT_NODE_KIND_4:
1050  {
1051  RT_NODE_4 *n4 = (RT_NODE_4 *) node;
1052 
1053  for (int i = 0; i < n4->base.count; i++)
1054  {
1055  if (n4->chunks[i] == chunk)
1056  return &n4->children[i];
1057  }
1058  return NULL;
1059  }
1060  case RT_NODE_KIND_16:
1061  return RT_NODE_16_SEARCH_EQ((RT_NODE_16 *) node, chunk);
1062  case RT_NODE_KIND_48:
1063  {
1064  RT_NODE_48 *n48 = (RT_NODE_48 *) node;
1065  int slotpos = n48->slot_idxs[chunk];
1066 
1067  if (slotpos == RT_INVALID_SLOT_IDX)
1068  return NULL;
1069 
1070  return RT_NODE_48_GET_CHILD(n48, chunk);
1071  }
1072  case RT_NODE_KIND_256:
1073  {
1074  RT_NODE_256 *n256 = (RT_NODE_256 *) node;
1075 
1076  if (!RT_NODE_256_IS_CHUNK_USED(n256, chunk))
1077  return NULL;
1078 
1079  return RT_NODE_256_GET_CHILD(n256, chunk);
1080  }
1081  default:
1082  pg_unreachable();
1083  }
1084 }
1085 
1086 /*
1087  * Search for the given key in the radix tree. Return a pointer to the value
1088  * if found, otherwise return NULL.
1089  *
1090  * Since the function returns a pointer (to support variable-length values),
1091  * the caller is responsible for locking until it's finished with the value.
1092  */
1093 RT_SCOPE RT_VALUE_TYPE *
1094 RT_FIND(RT_RADIX_TREE * tree, uint64 key)
1095 {
1096  RT_CHILD_PTR node;
1097  RT_PTR_ALLOC *slot = NULL;
1098  int shift;
1099 
1100 #ifdef RT_SHMEM
1101  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1102 #endif
1103 
1104  if (key > tree->ctl->max_val)
1105  return NULL;
1106 
1107  Assert(RT_PTR_ALLOC_IS_VALID(tree->ctl->root));
1108  node.alloc = tree->ctl->root;
1109  shift = tree->ctl->start_shift;
1110 
1111  /* Descend the tree */
1112  while (shift >= 0)
1113  {
1114  RT_PTR_SET_LOCAL(tree, &node);
1115  slot = RT_NODE_SEARCH(node.local, RT_GET_KEY_CHUNK(key, shift));
1116  if (slot == NULL)
1117  return NULL;
1118 
1119  node.alloc = *slot;
1120  shift -= RT_SPAN;
1121  }
1122 
1123  if (RT_CHILDPTR_IS_VALUE(*slot))
1124  return (RT_VALUE_TYPE *) slot;
1125  else
1126  {
1127  RT_PTR_SET_LOCAL(tree, &node);
1128  return (RT_VALUE_TYPE *) node.local;
1129  }
1130 }
1131 
1132 /***************** INSERTION *****************/
1133 
1134 #define RT_NODE_MUST_GROW(node) \
1135  ((node)->count == (node)->fanout)
1136 
1137 /*
1138  * Return index of the chunk and slot arrays for inserting into the node,
1139  * such that the arrays remain ordered.
1140  */
1141 static inline int
1142 RT_NODE_4_GET_INSERTPOS(RT_NODE_4 * node, uint8 chunk, int count)
1143 {
1144  int idx;
1145 
1146  for (idx = 0; idx < count; idx++)
1147  {
1148  if (node->chunks[idx] >= chunk)
1149  break;
1150  }
1151 
1152  return idx;
1153 }
1154 
1155 /*
1156  * Return index of the chunk and slot arrays for inserting into the node,
1157  * such that the arrays remain ordered.
1158  */
1159 static inline int
1160 RT_NODE_16_GET_INSERTPOS(RT_NODE_16 * node, uint8 chunk)
1161 {
1162  int count = node->base.count;
1163 #if defined(USE_NO_SIMD) || defined(USE_ASSERT_CHECKING)
1164  int index;
1165 #endif
1166 
1167 #ifndef USE_NO_SIMD
1168  Vector8 spread_chunk;
1169  Vector8 haystack1;
1170  Vector8 haystack2;
1171  Vector8 cmp1;
1172  Vector8 cmp2;
1173  Vector8 min1;
1174  Vector8 min2;
1175  uint32 bitfield;
1176  int index_simd;
1177 #endif
1178 
1179  /*
1180  * First compare the last element. There are two reasons to branch here:
1181  *
1182  * 1) A realistic pattern is inserting ordered keys. In that case,
1183  * non-SIMD platforms must do a linear search to the last chunk to find
1184  * the insert position. This will get slower as the node fills up.
1185  *
1186  * 2) On SIMD platforms, we must branch anyway to make sure we don't bit
1187  * scan an empty bitfield. Doing the branch here eliminates some work that
1188  * we might otherwise throw away.
1189  */
1190  Assert(count > 0);
1191  if (node->chunks[count - 1] < chunk)
1192  return count;
1193 
1194 #if defined(USE_NO_SIMD) || defined(USE_ASSERT_CHECKING)
1195 
1196  for (index = 0; index < count; index++)
1197  {
1198  if (node->chunks[index] > chunk)
1199  break;
1200  }
1201 #endif
1202 
1203 #ifndef USE_NO_SIMD
1204 
1205  /*
1206  * This is a bit more complicated than RT_NODE_16_SEARCH_EQ(), because no
1207  * unsigned uint8 comparison instruction exists, at least for SSE2. So we
1208  * need to play some trickery using vector8_min() to effectively get >=.
1209  * There'll never be any equal elements in current uses, but that's what
1210  * we get here...
1211  */
1212  spread_chunk = vector8_broadcast(chunk);
1213  vector8_load(&haystack1, &node->chunks[0]);
1214  vector8_load(&haystack2, &node->chunks[sizeof(Vector8)]);
1215  min1 = vector8_min(spread_chunk, haystack1);
1216  min2 = vector8_min(spread_chunk, haystack2);
1217  cmp1 = vector8_eq(spread_chunk, min1);
1218  cmp2 = vector8_eq(spread_chunk, min2);
1219  bitfield = vector8_highbit_mask(cmp1) | (vector8_highbit_mask(cmp2) << sizeof(Vector8));
1220 
1221  Assert((bitfield & ((UINT64CONST(1) << count) - 1)) != 0);
1222  index_simd = pg_rightmost_one_pos32(bitfield);
1223 
1224  Assert(index_simd == index);
1225  return index_simd;
1226 #else
1227  return index;
1228 #endif
1229 }
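The `vector8_min()` trickery above can be shown per lane in scalar C. This is a sketch with invented helper names, assuming the GCC/Clang `__builtin_ctz` builtin: since SSE2 lacks an unsigned byte comparison, `min(a, b) == a` stands in for `a <= b`, and the first position where the new chunk is <= the stored chunk is the insert position.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane view of the SSE2 workaround (illustrative names): there is no
 * unsigned byte ">=" instruction, but min(a, b) == a holds exactly when
 * a <= b, so a min followed by an equality compare gives the predicate. */
static int
chunk_le(uint8_t a, uint8_t b)
{
    uint8_t m = a < b ? a : b;  /* one lane of vector8_min */
    return m == a;              /* one lane of vector8_eq */
}

/* Insert position = first index whose stored chunk is >= the new chunk */
static int
get_insertpos_sketch(const uint8_t *chunks, int count, uint8_t chunk)
{
    uint32_t bitfield = 0;

    for (int i = 0; i < count; i++)
        bitfield |= (uint32_t) chunk_le(chunk, chunks[i]) << i;

    /* count trailing zeros (GCC/Clang builtin); no bit set => append at end */
    return bitfield ? __builtin_ctz(bitfield) : count;
}
```

The early branch in the real function guarantees the bitfield is nonempty before the bit scan; the sketch handles the empty case explicitly instead.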
1230 
1231 /* Shift the elements right at "insertpos" by one */
1232 static inline void
1233 RT_SHIFT_ARRAYS_FOR_INSERT(uint8 *chunks, RT_PTR_ALLOC * children, int count, int insertpos)
1234 {
1235  /*
1236  * This is basically a memmove, but written in a simple loop for speed on
1237  * small inputs.
1238  */
1239  for (int i = count - 1; i >= insertpos; i--)
1240  {
1241  /* workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481 */
1242 #ifdef __GNUC__
1243  __asm__("");
1244 #endif
1245  chunks[i + 1] = chunks[i];
1246  children[i + 1] = children[i];
1247  }
1248 }
1249 
1250 /*
1251  * Copy both chunk and slot arrays into the right
1252  * place. The caller is responsible for inserting the new element.
1253  */
1254 static inline void
1255 RT_COPY_ARRAYS_FOR_INSERT(uint8 *dst_chunks, RT_PTR_ALLOC * dst_children,
1256  uint8 *src_chunks, RT_PTR_ALLOC * src_children,
1257  int count, int insertpos)
1258 {
1259  for (int i = 0; i < count; i++)
1260  {
1261  int sourceidx = i;
1262 
1263  /* use a branch-free computation to skip the index of the new element */
1264  int destidx = i + (i >= insertpos);
1265 
1266  dst_chunks[destidx] = src_chunks[sourceidx];
1267  dst_children[destidx] = src_children[sourceidx];
1268  }
1269 }
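The branch-free index computation used above can be seen in isolation in this small sketch (hypothetical names): the boolean `(i >= insertpos)` evaluates to 0 or 1, so destination indices at or past the insert position are shifted up by one, leaving a hole for the new element without any branch in the loop body.

```c
#include <assert.h>
#include <stdint.h>

/* Copy "count" elements while leaving a hole at "insertpos" (illustrative):
 * (i >= insertpos) is 0 or 1, so entries at or past the insert position
 * land one slot to the right of where they started. */
static void
copy_with_hole(uint8_t *dst, const uint8_t *src, int count, int insertpos)
{
    for (int i = 0; i < count; i++)
        dst[i + (i >= insertpos)] = src[i];
}
```

After the copy, the caller stores the new element into `dst[insertpos]`, mirroring how RT_COPY_ARRAYS_FOR_INSERT leaves insertion to its caller.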
1270 
1271 static inline RT_PTR_ALLOC *
1272 RT_ADD_CHILD_256(RT_RADIX_TREE * tree, RT_CHILD_PTR node, uint8 chunk)
1273 {
1274  RT_NODE_256 *n256 = (RT_NODE_256 *) node.local;
1275  int idx = RT_BM_IDX(chunk);
1276  int bitnum = RT_BM_BIT(chunk);
1277 
1278  /* Mark the slot used for "chunk" */
1279  n256->isset[idx] |= ((bitmapword) 1 << bitnum);
1280 
1281  n256->base.count++;
1282  RT_VERIFY_NODE((RT_NODE *) n256);
1283 
1284  return RT_NODE_256_GET_CHILD(n256, chunk);
1285 }
1286 
1287 static pg_noinline RT_PTR_ALLOC *
1288 RT_GROW_NODE_48(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node,
1289  uint8 chunk)
1290 {
1291  RT_NODE_48 *n48 = (RT_NODE_48 *) node.local;
1292  RT_CHILD_PTR newnode;
1293  RT_NODE_256 *new256;
1294  int i = 0;
1295 
1296  /* initialize new node */
1297  newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_256, RT_CLASS_256);
1298  new256 = (RT_NODE_256 *) newnode.local;
1299 
1300  /* copy over the entries */
1301  RT_COPY_COMMON(newnode, node);
1302  for (int word_num = 0; word_num < RT_BM_IDX(RT_NODE_MAX_SLOTS); word_num++)
1303  {
1304  bitmapword bitmap = 0;
1305 
1306  /*
1307  * Bit manipulation is a surprisingly large portion of the overhead in
1308  * the naive implementation. Doing stores word-at-a-time removes a lot
1309  * of that overhead.
1310  */
1311  for (int bit = 0; bit < BITS_PER_BITMAPWORD; bit++)
1312  {
1313  uint8 offset = n48->slot_idxs[i];
1314 
1315  if (offset != RT_INVALID_SLOT_IDX)
1316  {
1317  bitmap |= ((bitmapword) 1 << bit);
1318  new256->children[i] = n48->children[offset];
1319  }
1320 
1321  i++;
1322  }
1323 
1324  new256->isset[word_num] = bitmap;
1325  }
1326 
1327  /* free old node and update reference in parent */
1328  *parent_slot = newnode.alloc;
1329  RT_FREE_NODE(tree, node);
1330 
1331  return RT_ADD_CHILD_256(tree, newnode, chunk);
1332 }
1333 
1334 static inline RT_PTR_ALLOC *
1335 RT_ADD_CHILD_48(RT_RADIX_TREE * tree, RT_CHILD_PTR node, uint8 chunk)
1336 {
1337  RT_NODE_48 *n48 = (RT_NODE_48 *) node.local;
1338  int insertpos;
1339  int idx = 0;
1340  bitmapword w,
1341  inverse;
1342 
1343  /* get the first word with at least one bit not set */
1344  for (int i = 0; i < RT_BM_IDX(RT_FANOUT_48_MAX); i++)
1345  {
1346  w = n48->isset[i];
1347  if (w < ~((bitmapword) 0))
1348  {
1349  idx = i;
1350  break;
1351  }
1352  }
1353 
1354  /* To get the first unset bit in w, get the first set bit in ~w */
1355  inverse = ~w;
1356  insertpos = idx * BITS_PER_BITMAPWORD;
1357  insertpos += bmw_rightmost_one_pos(inverse);
1358  Assert(insertpos < n48->base.fanout);
1359 
1360  /* mark the slot used by setting the rightmost zero bit */
1361  n48->isset[idx] |= w + 1;
1362 
1363  /* insert new chunk into place */
1364  n48->slot_idxs[chunk] = insertpos;
1365 
1366  n48->base.count++;
1367  RT_VERIFY_NODE((RT_NODE *) n48);
1368 
1369  return &n48->children[insertpos];
1370 }
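The bitmap manipulation in RT_ADD_CHILD_48 relies on two standard bit tricks, sketched here with a plain `uint64_t` standing in for `bitmapword` (the helper names and the GCC/Clang `__builtin_ctzll` builtin are assumptions): the first free slot is the rightmost zero bit, i.e. the rightmost one bit of the complement, and `w | (w + 1)` sets exactly that bit. As in the node-48 code, the caller must guarantee a free slot exists; otherwise `~w` is zero and the trailing-zero count is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Rightmost zero bit of the bitmap = first free slot (illustrative helper,
 * plain uint64_t standing in for bitmapword). Requires at least one zero
 * bit in w, which the node-48 grow path guarantees. */
static int
first_free_slot(uint64_t w)
{
    return __builtin_ctzll(~w);  /* GCC/Clang builtin */
}

/* w + 1 clears the trailing run of ones and sets the rightmost zero bit,
 * so OR-ing it back into w sets exactly that one bit. */
static uint64_t
mark_slot_used(uint64_t w)
{
    return w | (w + 1);
}
```

Note the second trick works even when the rightmost zero is not at the bottom: for `w = 0b1011`, `w + 1 = 0b1100` and the OR yields `0b1111`, setting only bit 2.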
1371 
1372 static pg_noinline RT_PTR_ALLOC *
1373 RT_GROW_NODE_16(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node,
1374  uint8 chunk)
1375 {
1376  RT_NODE_16 *n16 = (RT_NODE_16 *) node.local;
1377  int insertpos;
1378 
1379  if (n16->base.fanout < RT_FANOUT_16_HI)
1380  {
1381  RT_CHILD_PTR newnode;
1382  RT_NODE_16 *new16;
1383 
1384  Assert(n16->base.fanout == RT_FANOUT_16_LO);
1385 
1386  /* initialize new node */
1387  newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_16, RT_CLASS_16_HI);
1388  new16 = (RT_NODE_16 *) newnode.local;
1389 
1390  /* copy over existing entries */
1391  RT_COPY_COMMON(newnode, node);
1392  Assert(n16->base.count == RT_FANOUT_16_LO);
1393  insertpos = RT_NODE_16_GET_INSERTPOS(n16, chunk);
1394  RT_COPY_ARRAYS_FOR_INSERT(new16->chunks, new16->children,
1395  n16->chunks, n16->children,
1396  RT_FANOUT_16_LO, insertpos);
1397 
1398  /* insert new chunk into place */
1399  new16->chunks[insertpos] = chunk;
1400 
1401  new16->base.count++;
1402  RT_VERIFY_NODE((RT_NODE *) new16);
1403 
1404  /* free old node and update references */
1405  RT_FREE_NODE(tree, node);
1406  *parent_slot = newnode.alloc;
1407 
1408  return &new16->children[insertpos];
1409  }
1410  else
1411  {
1412  RT_CHILD_PTR newnode;
1413  RT_NODE_48 *new48;
1414  int idx,
1415  bit;
1416 
1417  Assert(n16->base.fanout == RT_FANOUT_16_HI);
1418 
1419  /* initialize new node */
1420  newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_48, RT_CLASS_48);
1421  new48 = (RT_NODE_48 *) newnode.local;
1422 
1423  /* copy over the entries */
1424  RT_COPY_COMMON(newnode, node);
1425  for (int i = 0; i < RT_FANOUT_16_HI; i++)
1426  new48->slot_idxs[n16->chunks[i]] = i;
1427  memcpy(&new48->children[0], &n16->children[0], RT_FANOUT_16_HI * sizeof(new48->children[0]));
1428 
1429  /*
1430  * Since we just copied a dense array, we can fill "isset" using a
1431  * single store, provided the length of that array is at most the
1432  * number of bits in a bitmapword.
1433  */
1435  new48->isset[0] = (bitmapword) (((uint64) 1 << RT_FANOUT_16_HI) - 1);
1436 
1437  /* put the new child at the end of the copied entries */
1438  insertpos = RT_FANOUT_16_HI;
1439  idx = RT_BM_IDX(insertpos);
1440  bit = RT_BM_BIT(insertpos);
1441 
1442  /* mark the slot used */
1443  new48->isset[idx] |= ((bitmapword) 1 << bit);
1444 
1445  /* insert new chunk into place */
1446  new48->slot_idxs[chunk] = insertpos;
1447 
1448  new48->base.count++;
1449  RT_VERIFY_NODE((RT_NODE *) new48);
1450 
1451  /* free old node and update reference in parent */
1452  *parent_slot = newnode.alloc;
1453  RT_FREE_NODE(tree, node);
1454 
1455  return &new48->children[insertpos];
1456  }
1457 }
1458 
1459 static inline RT_PTR_ALLOC *
1460 RT_ADD_CHILD_16(RT_RADIX_TREE * tree, RT_CHILD_PTR node, uint8 chunk)
1461 {
1462  RT_NODE_16 *n16 = (RT_NODE_16 *) node.local;
1463  int insertpos = RT_NODE_16_GET_INSERTPOS(n16, chunk);
1464 
1465  /* shift chunks and children */
1466  RT_SHIFT_ARRAYS_FOR_INSERT(n16->chunks, n16->children,
1467  n16->base.count, insertpos);
1468 
1469  /* insert new chunk into place */
1470  n16->chunks[insertpos] = chunk;
1471 
1472  n16->base.count++;
1473  RT_VERIFY_NODE((RT_NODE *) n16);
1474 
1475  return &n16->children[insertpos];
1476 }
1477 
1478 static pg_noinline RT_PTR_ALLOC *
1479 RT_GROW_NODE_4(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node,
1480  uint8 chunk)
1481 {
1482  RT_NODE_4 *n4 = (RT_NODE_4 *) (node.local);
1483  RT_CHILD_PTR newnode;
1484  RT_NODE_16 *new16;
1485  int insertpos;
1486 
1487  /* initialize new node */
1488  newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_16, RT_CLASS_16_LO);
1489  new16 = (RT_NODE_16 *) newnode.local;
1490 
1491  /* copy over existing entries */
1492  RT_COPY_COMMON(newnode, node);
1493  Assert(n4->base.count == RT_FANOUT_4);
1494  insertpos = RT_NODE_4_GET_INSERTPOS(n4, chunk, RT_FANOUT_4);
1495  RT_COPY_ARRAYS_FOR_INSERT(new16->chunks, new16->children,
1496  n4->chunks, n4->children,
1497  RT_FANOUT_4, insertpos);
1498 
1499  /* insert new chunk into place */
1500  new16->chunks[insertpos] = chunk;
1501 
1502  new16->base.count++;
1503  RT_VERIFY_NODE((RT_NODE *) new16);
1504 
1505  /* free old node and update reference in parent */
1506  *parent_slot = newnode.alloc;
1507  RT_FREE_NODE(tree, node);
1508 
1509  return &new16->children[insertpos];
1510 }
1511 
1512 static inline RT_PTR_ALLOC *
1513 RT_ADD_CHILD_4(RT_RADIX_TREE * tree, RT_CHILD_PTR node, uint8 chunk)
1514 {
1515  RT_NODE_4 *n4 = (RT_NODE_4 *) (node.local);
1516  int count = n4->base.count;
1517  int insertpos = RT_NODE_4_GET_INSERTPOS(n4, chunk, count);
1518 
1519  /* shift chunks and children */
1520  RT_SHIFT_ARRAYS_FOR_INSERT(n4->chunks, n4->children,
1521  count, insertpos);
1522 
1523  /* insert new chunk into place */
1524  n4->chunks[insertpos] = chunk;
1525 
1526  n4->base.count++;
1527  RT_VERIFY_NODE((RT_NODE *) n4);
1528 
1529  return &n4->children[insertpos];
1530 }
1531 
1532 /*
1533  * Reserve slot in "node"'s child array. The caller will populate it
1534  * with the actual child pointer.
1535  *
1536  * "parent_slot" is the address of the parent's child pointer to "node".
1537  * If the node we're inserting into needs to grow, we update the parent's
1538  * child pointer with the pointer to the new larger node.
1539  */
1540 static inline RT_PTR_ALLOC *
1541 RT_NODE_INSERT(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node,
1542  uint8 chunk)
1543 {
1544  RT_NODE *n = node.local;
1545 
1546  switch (n->kind)
1547  {
1548  case RT_NODE_KIND_4:
1549  {
1550  if (unlikely(RT_NODE_MUST_GROW(n)))
1551  return RT_GROW_NODE_4(tree, parent_slot, node, chunk);
1552 
1553  return RT_ADD_CHILD_4(tree, node, chunk);
1554  }
1555  case RT_NODE_KIND_16:
1556  {
1557  if (unlikely(RT_NODE_MUST_GROW(n)))
1558  return RT_GROW_NODE_16(tree, parent_slot, node, chunk);
1559 
1560  return RT_ADD_CHILD_16(tree, node, chunk);
1561  }
1562  case RT_NODE_KIND_48:
1563  {
1564  if (unlikely(RT_NODE_MUST_GROW(n)))
1565  return RT_GROW_NODE_48(tree, parent_slot, node, chunk);
1566 
1567  return RT_ADD_CHILD_48(tree, node, chunk);
1568  }
1569  case RT_NODE_KIND_256:
1570  return RT_ADD_CHILD_256(tree, node, chunk);
1571  default:
1572  pg_unreachable();
1573  }
1574 }
1575 
1576 /*
1577  * The radix tree doesn't have sufficient height. Put new node(s) on top,
1578  * and move the old node below it.
1579  */
1580 static pg_noinline void
1581 RT_EXTEND_UP(RT_RADIX_TREE * tree, uint64 key)
1582 {
1583  int target_shift = RT_KEY_GET_SHIFT(key);
1584  int shift = tree->ctl->start_shift;
1585 
1586  Assert(shift < target_shift);
1587 
1588  /* Grow tree upwards until start shift can accommodate the key */
1589  while (shift < target_shift)
1590  {
1591  RT_CHILD_PTR node;
1592  RT_NODE_4 *n4;
1593 
1594  node = RT_ALLOC_NODE(tree, RT_NODE_KIND_4, RT_CLASS_4);
1595  n4 = (RT_NODE_4 *) node.local;
1596  n4->base.count = 1;
1597  n4->chunks[0] = 0;
1598  n4->children[0] = tree->ctl->root;
1599 
1600  /* Update the root */
1601  tree->ctl->root = node.alloc;
1602 
1603  shift += RT_SPAN;
1604  }
1605 
1606  tree->ctl->max_val = RT_SHIFT_GET_MAX_VAL(target_shift);
1607  tree->ctl->start_shift = target_shift;
1608 }
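The height bookkeeping in RT_EXTEND_UP can be sketched numerically. Assuming a span of 8 bits, and with illustrative function names rather than the template's actual macros, the starting shift for a key is the shift of its topmost nonzero chunk, and the maximum value representable at a given starting shift is all-ones up through that chunk:

```c
#include <assert.h>
#include <stdint.h>

#define SPAN 8  /* assumed chunk width, standing in for RT_SPAN */

/* Shift of the topmost nonzero chunk of "key" (cf. RT_KEY_GET_SHIFT);
 * uses the GCC/Clang count-leading-zeros builtin */
static int
key_get_shift(uint64_t key)
{
    if (key == 0)
        return 0;
    /* highest set bit position, rounded down to a multiple of the span */
    return (63 - __builtin_clzll(key)) / SPAN * SPAN;
}

/* Largest key a tree with the given start shift can hold
 * (cf. RT_SHIFT_GET_MAX_VAL) */
static uint64_t
shift_get_max_val(int shift)
{
    if (shift + SPAN >= 64)
        return UINT64_MAX;
    return (UINT64_C(1) << (shift + SPAN)) - 1;
}
```

With these definitions the grow-upwards loop terminates exactly when `shift == key_get_shift(key)`, at which point every chunk of the key has a level to live at.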
1609 
1610 /*
1611  * Insert a chain of nodes until we reach the lowest level,
1612  * and return the address of a slot to be filled further up
1613  * the call stack.
1614  */
1615 static pg_noinline RT_PTR_ALLOC *
1616 RT_EXTEND_DOWN(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, uint64 key, int shift)
1617 {
1618  RT_CHILD_PTR node,
1619  child;
1620  RT_NODE_4 *n4;
1621 
1622  /*
1623  * The child pointer of the first node in the chain goes in the
1624  * caller-provided slot.
1625  */
1626  child = RT_ALLOC_NODE(tree, RT_NODE_KIND_4, RT_CLASS_4);
1627  *parent_slot = child.alloc;
1628 
1629  node = child;
1630  shift -= RT_SPAN;
1631 
1632  while (shift > 0)
1633  {
1634  child = RT_ALLOC_NODE(tree, RT_NODE_KIND_4, RT_CLASS_4);
1635 
1636  /* We open-code the insertion ourselves, for speed. */
1637  n4 = (RT_NODE_4 *) node.local;
1638  n4->base.count = 1;
1639  n4->chunks[0] = RT_GET_KEY_CHUNK(key, shift);
1640  n4->children[0] = child.alloc;
1641 
1642  node = child;
1643  shift -= RT_SPAN;
1644  }
1645  Assert(shift == 0);
1646 
1647  /* Reserve slot for the value. */
1648  n4 = (RT_NODE_4 *) node.local;
1649  n4->chunks[0] = RT_GET_KEY_CHUNK(key, 0);
1650  n4->base.count = 1;
1651 
1652  return &n4->children[0];
1653 }
1654 
1655 /*
1656  * Workhorse for RT_SET
1657  *
1658  * "parent_slot" is the address of the child pointer we just followed,
1659  * in the parent's array of children, needed if inserting into the
1660  * current node causes it to grow.
1661  */
1662 static RT_PTR_ALLOC *
1663 RT_GET_SLOT_RECURSIVE(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, uint64 key, int shift, bool *found)
1664 {
1665  RT_PTR_ALLOC *slot;
1666  RT_CHILD_PTR node;
1667  uint8 chunk = RT_GET_KEY_CHUNK(key, shift);
1668 
1669  node.alloc = *parent_slot;
1670  RT_PTR_SET_LOCAL(tree, &node);
1671  slot = RT_NODE_SEARCH(node.local, chunk);
1672 
1673  if (slot == NULL)
1674  {
1675  *found = false;
1676 
1677  /* reserve slot for the caller to populate */
1678 
1679  slot = RT_NODE_INSERT(tree, parent_slot, node, chunk);
1680 
1681  if (shift == 0)
1682  return slot;
1683  else
1684  return RT_EXTEND_DOWN(tree, slot, key, shift);
1685  }
1686  else
1687  {
1688  if (shift == 0)
1689  {
1690  *found = true;
1691  return slot;
1692  }
1693  else
1694  return RT_GET_SLOT_RECURSIVE(tree, slot, key, shift - RT_SPAN, found);
1695  }
1696 }
1697 
1698 /*
1699  * Set "key" to the value that "value_p" points to. If the entry already
1700  * exists, we update its value and return true; otherwise we insert it and return false.
1701  *
1702  * Taking a lock in exclusive mode is the caller's responsibility.
1703  */
1704 RT_SCOPE bool
1705 RT_SET(RT_RADIX_TREE * tree, uint64 key, RT_VALUE_TYPE * value_p)
1706 {
1707  bool found;
1708  RT_PTR_ALLOC *slot;
1709  size_t value_sz = RT_GET_VALUE_SIZE(value_p);
1710 
1711 #ifdef RT_SHMEM
1712  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1713 #endif
1714 
1715  Assert(RT_PTR_ALLOC_IS_VALID(tree->ctl->root));
1716 
1717  /* Extend the tree if necessary */
1718  if (unlikely(key > tree->ctl->max_val))
1719  {
1720  if (tree->ctl->num_keys == 0)
1721  {
1722  RT_CHILD_PTR node;
1723  RT_NODE_4 *n4;
1724  int start_shift = RT_KEY_GET_SHIFT(key);
1725 
1726  /*
1727  * With an empty root node, we don't extend the tree upwards,
1728  * since that would result in orphan empty nodes. Instead we open
1729  * code inserting into the root node and extend downward from
1730  * there.
1731  */
1732  node.alloc = tree->ctl->root;
1733  RT_PTR_SET_LOCAL(tree, &node);
1734  n4 = (RT_NODE_4 *) node.local;
1735  n4->base.count = 1;
1736  n4->chunks[0] = RT_GET_KEY_CHUNK(key, start_shift);
1737 
1738  slot = RT_EXTEND_DOWN(tree, &n4->children[0], key, start_shift);
1739  found = false;
1740  tree->ctl->start_shift = start_shift;
1741  tree->ctl->max_val = RT_SHIFT_GET_MAX_VAL(start_shift);
1742  goto have_slot;
1743  }
1744  else
1745  RT_EXTEND_UP(tree, key);
1746  }
1747 
1748  slot = RT_GET_SLOT_RECURSIVE(tree, &tree->ctl->root,
1749  key, tree->ctl->start_shift, &found);
1750 
1751 have_slot:
1752  Assert(slot != NULL);
1753 
1754  if (RT_VALUE_IS_EMBEDDABLE(value_p))
1755  {
1756  /* free the existing leaf */
1757  if (found && !RT_CHILDPTR_IS_VALUE(*slot))
1758  RT_FREE_LEAF(tree, *slot);
1759 
1760  /* store value directly in child pointer slot */
1761  memcpy(slot, value_p, value_sz);
1762 
1763 #ifdef RT_RUNTIME_EMBEDDABLE_VALUE
1764  /* tag child pointer */
1765 #ifdef RT_SHMEM
1766  *slot |= 1;
1767 #else
1768  *((uintptr_t *) slot) |= 1;
1769 #endif
1770 #endif
1771  }
1772  else
1773  {
1774  RT_CHILD_PTR leaf;
1775 
1776  if (found && !RT_CHILDPTR_IS_VALUE(*slot))
1777  {
1778  Assert(RT_PTR_ALLOC_IS_VALID(*slot));
1779  leaf.alloc = *slot;
1780  RT_PTR_SET_LOCAL(tree, &leaf);
1781 
1782  if (RT_GET_VALUE_SIZE((RT_VALUE_TYPE *) leaf.local) != value_sz)
1783  {
1784  /*
1785  * different sizes, so first free the existing leaf before
1786  * allocating a new one
1787  */
1788  RT_FREE_LEAF(tree, *slot);
1789  leaf = RT_ALLOC_LEAF(tree, value_sz);
1790  *slot = leaf.alloc;
1791  }
1792  }
1793  else
1794  {
1795  /* allocate new leaf and store it in the child array */
1796  leaf = RT_ALLOC_LEAF(tree, value_sz);
1797  *slot = leaf.alloc;
1798  }
1799 
1800  memcpy(leaf.local, value_p, value_sz);
1801  }
1802 
1803  /* Update the statistics */
1804  if (!found)
1805  tree->ctl->num_keys++;
1806 
1807  return found;
1808 }
1809 
1810 /***************** SETUP / TEARDOWN *****************/
1811 
1812 /*
1813  * Create the radix tree in the given memory context and return it.
1814  *
1815  * All local memory required for a radix tree is allocated in the given
1816  * memory context and its children. Note that RT_FREE() will delete all
1817  * allocated space within the given memory context, so the dsa_area should
1818  * be created in a different context.
1819  */
1821 #ifdef RT_SHMEM
1822 RT_CREATE(MemoryContext ctx, dsa_area *dsa, int tranche_id)
1823 #else
1824 RT_CREATE(MemoryContext ctx)
1825 #endif
1826 {
1827  RT_RADIX_TREE *tree;
1828  RT_CHILD_PTR rootnode;
1830 #ifdef RT_SHMEM
1831  dsa_pointer dp;
1832 #endif
1833 
1835 
1836  tree = (RT_RADIX_TREE *) MemoryContextAllocZero(ctx, sizeof(RT_RADIX_TREE));
1837  tree->context = ctx;
1838 
1839  /*
1840  * Separate context for iteration in case the tree context doesn't support
1841  * pfree
1842  */
1843  tree->iter_context = AllocSetContextCreate(ctx,
1844  RT_STR(RT_PREFIX) "_radix_tree iter context",
1845  ALLOCSET_SMALL_SIZES);
1846 
1847 #ifdef RT_SHMEM
1848  tree->dsa = dsa;
1849  dp = dsa_allocate0(dsa, sizeof(RT_RADIX_TREE_CONTROL));
1850  tree->ctl = (RT_RADIX_TREE_CONTROL *) dsa_get_address(dsa, dp);
1851  tree->ctl->handle = dp;
1852  tree->ctl->magic = RT_RADIX_TREE_MAGIC;
1853  LWLockInitialize(&tree->ctl->lock, tranche_id);
1854 #else
1855  tree->ctl = (RT_RADIX_TREE_CONTROL *) palloc0(sizeof(RT_RADIX_TREE_CONTROL));
1856 
1857  /* Create a slab context for each size class */
1858  for (int i = 0; i < RT_NUM_SIZE_CLASSES; i++)
1859  {
1860  RT_SIZE_CLASS_ELEM size_class = RT_SIZE_CLASS_INFO[i];
1861  size_t inner_blocksize = RT_SLAB_BLOCK_SIZE(size_class.allocsize);
1862 
1863  tree->node_slabs[i] = SlabContextCreate(ctx,
1864  size_class.name,
1865  inner_blocksize,
1866  size_class.allocsize);
1867  }
1868 
1869  /* By default we use the passed context for leaves. */
1870  tree->leaf_context = tree->context;
1871 
1872 #ifndef RT_VARLEN_VALUE_SIZE
1873 
1874  /*
1875  * For leaves storing fixed-length values, we use a slab context to avoid
1876  * the possibility of space wastage by power-of-2 rounding up.
1877  */
1878  if (sizeof(RT_VALUE_TYPE) > sizeof(RT_PTR_ALLOC))
1879  tree->leaf_context = SlabContextCreate(ctx,
1880  RT_STR(RT_PREFIX) "_radix_tree leaf context",
1881  RT_SLAB_BLOCK_SIZE(sizeof(RT_VALUE_TYPE)),
1882  sizeof(RT_VALUE_TYPE));
1883 #endif /* !RT_VARLEN_VALUE_SIZE */
1884 #endif /* RT_SHMEM */
1885 
1886  /* add root node now so that RT_SET can assume it exists */
1887  rootnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_4, RT_CLASS_4);
1888  tree->ctl->root = rootnode.alloc;
1889  tree->ctl->start_shift = 0;
1890  tree->ctl->max_val = RT_SHIFT_GET_MAX_VAL(0);
1891 
1893 
1894  return tree;
1895 }
1896 
1897 #ifdef RT_SHMEM
1898 RT_SCOPE RT_RADIX_TREE *
1899 RT_ATTACH(dsa_area *dsa, RT_HANDLE handle)
1900 {
1901  RT_RADIX_TREE *tree;
1902  dsa_pointer control;
1903 
1904  tree = (RT_RADIX_TREE *) palloc0(sizeof(RT_RADIX_TREE));
1905 
1906  /* Find the control object in shared memory */
1907  control = handle;
1908 
1909  tree->dsa = dsa;
1910  tree->ctl = (RT_RADIX_TREE_CONTROL *) dsa_get_address(dsa, control);
1911  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1912 
1913  return tree;
1914 }
1915 
1916 RT_SCOPE void
1917 RT_DETACH(RT_RADIX_TREE * tree)
1918 {
1919  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1920  pfree(tree);
1921 }
1922 
1923 RT_SCOPE RT_HANDLE
1924 RT_GET_HANDLE(RT_RADIX_TREE * tree)
1925 {
1926  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1927  return tree->ctl->handle;
1928 }
1929 
1930 RT_SCOPE void
1931 RT_LOCK_EXCLUSIVE(RT_RADIX_TREE * tree)
1932 {
1933  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1934  LWLockAcquire(&tree->ctl->lock, LW_EXCLUSIVE);
1935 }
1936 
1937 RT_SCOPE void
1938 RT_LOCK_SHARE(RT_RADIX_TREE * tree)
1939 {
1940  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1941  LWLockAcquire(&tree->ctl->lock, LW_SHARED);
1942 }
1943 
1944 RT_SCOPE void
1945 RT_UNLOCK(RT_RADIX_TREE * tree)
1946 {
1947  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
1948  LWLockRelease(&tree->ctl->lock);
1949 }
1950 
1951 /*
1952  * Recursively free all nodes allocated in the DSA area.
1953  */
1954 static void
1955 RT_FREE_RECURSE(RT_RADIX_TREE * tree, RT_PTR_ALLOC ptr, int shift)
1956 {
1957  RT_CHILD_PTR node;
1958 
1959  check_stack_depth();
1960 
1961  node.alloc = ptr;
1962  RT_PTR_SET_LOCAL(tree, &node);
1963 
1964  switch (node.local->kind)
1965  {
1966  case RT_NODE_KIND_4:
1967  {
1968  RT_NODE_4 *n4 = (RT_NODE_4 *) node.local;
1969 
1970  for (int i = 0; i < n4->base.count; i++)
1971  {
1972  RT_PTR_ALLOC child = n4->children[i];
1973 
1974  if (shift > 0)
1975  RT_FREE_RECURSE(tree, child, shift - RT_SPAN);
1976  else if (!RT_CHILDPTR_IS_VALUE(child))
1977  dsa_free(tree->dsa, child);
1978  }
1979 
1980  break;
1981  }
1982  case RT_NODE_KIND_16:
1983  {
1984  RT_NODE_16 *n16 = (RT_NODE_16 *) node.local;
1985 
1986  for (int i = 0; i < n16->base.count; i++)
1987  {
1988  RT_PTR_ALLOC child = n16->children[i];
1989 
1990  if (shift > 0)
1991  RT_FREE_RECURSE(tree, child, shift - RT_SPAN);
1992  else if (!RT_CHILDPTR_IS_VALUE(child))
1993  dsa_free(tree->dsa, child);
1994  }
1995 
1996  break;
1997  }
1998  case RT_NODE_KIND_48:
1999  {
2000  RT_NODE_48 *n48 = (RT_NODE_48 *) node.local;
2001 
2002  for (int i = 0; i < RT_NODE_MAX_SLOTS; i++)
2003  {
2004  RT_PTR_ALLOC child;
2005 
2006  if (!RT_NODE_48_IS_CHUNK_USED(n48, i))
2007  continue;
2008 
2009  child = *RT_NODE_48_GET_CHILD(n48, i);
2010 
2011  if (shift > 0)
2012  RT_FREE_RECURSE(tree, child, shift - RT_SPAN);
2013  else if (!RT_CHILDPTR_IS_VALUE(child))
2014  dsa_free(tree->dsa, child);
2015  }
2016 
2017  break;
2018  }
2019  case RT_NODE_KIND_256:
2020  {
2021  RT_NODE_256 *n256 = (RT_NODE_256 *) node.local;
2022 
2023  for (int i = 0; i < RT_NODE_MAX_SLOTS; i++)
2024  {
2025  RT_PTR_ALLOC child;
2026 
2027  if (!RT_NODE_256_IS_CHUNK_USED(n256, i))
2028  continue;
2029 
2030  child = *RT_NODE_256_GET_CHILD(n256, i);
2031 
2032  if (shift > 0)
2033  RT_FREE_RECURSE(tree, child, shift - RT_SPAN);
2034  else if (!RT_CHILDPTR_IS_VALUE(child))
2035  dsa_free(tree->dsa, child);
2036  }
2037 
2038  break;
2039  }
2040  }
2041 
2042  /* Free the inner node */
2043  dsa_free(tree->dsa, ptr);
2044 }
2045 #endif
2046 
2047 /*
2048  * Free the radix tree, including all nodes and leaves.
2049  */
2050 RT_SCOPE void
2051 RT_FREE(RT_RADIX_TREE * tree)
2052 {
2053 #ifdef RT_SHMEM
2054  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
2055 
2056  /* Free all memory used for radix tree nodes */
2057  Assert(RT_PTR_ALLOC_IS_VALID(tree->ctl->root));
2058  RT_FREE_RECURSE(tree, tree->ctl->root, tree->ctl->start_shift);
2059 
2060  /*
2061  * Vandalize the control block to help catch programming error where other
2062  * backends access the memory formerly occupied by this radix tree.
2063  */
2064  tree->ctl->magic = 0;
2065  dsa_free(tree->dsa, tree->ctl->handle);
2066 #endif
2067 
2068  /*
2069  * Free all space allocated within the tree's context and delete all child
2070  * contexts such as those used for nodes.
2071  */
2072  MemoryContextReset(tree->context);
2073 }
2074 
2075 /***************** ITERATION *****************/
2076 
2077 /*
2078  * Create and return the iterator for the given radix tree.
2079  *
2080  * Taking a lock in shared mode during the iteration is the caller's
2081  * responsibility.
2082  */
2083 RT_SCOPE RT_ITER *
2084 RT_BEGIN_ITERATE(RT_RADIX_TREE * tree)
2085 {
2086  RT_ITER *iter;
2087  RT_CHILD_PTR root;
2088 
2089  iter = (RT_ITER *) MemoryContextAllocZero(tree->iter_context,
2090  sizeof(RT_ITER));
2091  iter->tree = tree;
2092 
2093  Assert(RT_PTR_ALLOC_IS_VALID(tree->ctl->root));
2094  root.alloc = iter->tree->ctl->root;
2095  RT_PTR_SET_LOCAL(tree, &root);
2096 
2097  iter->top_level = iter->tree->ctl->start_shift / RT_SPAN;
2098 
2099  /* Set the root to start */
2100  iter->cur_level = iter->top_level;
2101  iter->node_iters[iter->cur_level].node = root;
2102  iter->node_iters[iter->cur_level].idx = 0;
2103 
2104  return iter;
2105 }
2106 
2107 /*
2108  * Scan the inner node and return the next child pointer if one exists, otherwise
2109  * return NULL.
2110  */
2111 static inline RT_PTR_ALLOC *
2112 RT_NODE_ITERATE_NEXT(RT_ITER * iter, int level)
2113 {
2114  uint8 key_chunk = 0;
2115  RT_NODE_ITER *node_iter;
2116  RT_CHILD_PTR node;
2117  RT_PTR_ALLOC *slot = NULL;
2118 
2119 #ifdef RT_SHMEM
2120  Assert(iter->tree->ctl->magic == RT_RADIX_TREE_MAGIC);
2121 #endif
2122 
2123  node_iter = &(iter->node_iters[level]);
2124  node = node_iter->node;
2125 
2126  Assert(node.local != NULL);
2127 
2128  switch ((node.local)->kind)
2129  {
2130  case RT_NODE_KIND_4:
2131  {
2132  RT_NODE_4 *n4 = (RT_NODE_4 *) (node.local);
2133 
2134  if (node_iter->idx >= n4->base.count)
2135  return NULL;
2136 
2137  slot = &n4->children[node_iter->idx];
2138  key_chunk = n4->chunks[node_iter->idx];
2139  node_iter->idx++;
2140  break;
2141  }
2142  case RT_NODE_KIND_16:
2143  {
2144  RT_NODE_16 *n16 = (RT_NODE_16 *) (node.local);
2145 
2146  if (node_iter->idx >= n16->base.count)
2147  return NULL;
2148 
2149  slot = &n16->children[node_iter->idx];
2150  key_chunk = n16->chunks[node_iter->idx];
2151  node_iter->idx++;
2152  break;
2153  }
2154  case RT_NODE_KIND_48:
2155  {
2156  RT_NODE_48 *n48 = (RT_NODE_48 *) (node.local);
2157  int chunk;
2158 
2159  for (chunk = node_iter->idx; chunk < RT_NODE_MAX_SLOTS; chunk++)
2160  {
2161  if (RT_NODE_48_IS_CHUNK_USED(n48, chunk))
2162  break;
2163  }
2164 
2165  if (chunk >= RT_NODE_MAX_SLOTS)
2166  return NULL;
2167 
2168  slot = RT_NODE_48_GET_CHILD(n48, chunk);
2169 
2170  key_chunk = chunk;
2171  node_iter->idx = chunk + 1;
2172  break;
2173  }
2174  case RT_NODE_KIND_256:
2175  {
2176  RT_NODE_256 *n256 = (RT_NODE_256 *) (node.local);
2177  int chunk;
2178 
2179  for (chunk = node_iter->idx; chunk < RT_NODE_MAX_SLOTS; chunk++)
2180  {
2181  if (RT_NODE_256_IS_CHUNK_USED(n256, chunk))
2182  break;
2183  }
2184 
2185  if (chunk >= RT_NODE_MAX_SLOTS)
2186  return NULL;
2187 
2188  slot = RT_NODE_256_GET_CHILD(n256, chunk);
2189 
2190  key_chunk = chunk;
2191  node_iter->idx = chunk + 1;
2192  break;
2193  }
2194  }
2195 
2196  /* Update the key */
2197  iter->key &= ~(((uint64) RT_CHUNK_MASK) << (level * RT_SPAN));
2198  iter->key |= (((uint64) key_chunk) << (level * RT_SPAN));
2199 
2200  return slot;
2201 }
2202 
2203 /*
2204  * Return pointer to value and set key_p as long as there is a key. Otherwise
2205  * return NULL.
2206  */
2207 RT_SCOPE RT_VALUE_TYPE *
2208 RT_ITERATE_NEXT(RT_ITER * iter, uint64 *key_p)
2209 {
2210  RT_PTR_ALLOC *slot = NULL;
2211 
2212  while (iter->cur_level <= iter->top_level)
2213  {
2214  RT_CHILD_PTR node;
2215 
2216  slot = RT_NODE_ITERATE_NEXT(iter, iter->cur_level);
2217 
2218  if (iter->cur_level == 0 && slot != NULL)
2219  {
2220  /* Found a value at the leaf node */
2221  *key_p = iter->key;
2222  node.alloc = *slot;
2223 
2224  if (RT_CHILDPTR_IS_VALUE(*slot))
2225  return (RT_VALUE_TYPE *) slot;
2226  else
2227  {
2228  RT_PTR_SET_LOCAL(iter->tree, &node);
2229  return (RT_VALUE_TYPE *) node.local;
2230  }
2231  }
2232 
2233  if (slot != NULL)
2234  {
2235  /* Found the child slot, move down the tree */
2236  node.alloc = *slot;
2237  RT_PTR_SET_LOCAL(iter->tree, &node);
2238 
2239  iter->cur_level--;
2240  iter->node_iters[iter->cur_level].node = node;
2241  iter->node_iters[iter->cur_level].idx = 0;
2242  }
2243  else
2244  {
2245  /* Child slot not found, so move up the tree */
2246  iter->cur_level++;
2247  }
2248  }
2249 
2250  /* We've visited all nodes, so the iteration finished */
2251  return NULL;
2252 }
2253 
2254 /*
2255  * Terminate the iteration. The caller is responsible for releasing any locks.
2256  */
2257 RT_SCOPE void
2258 RT_END_ITERATE(RT_ITER * iter)
2259 {
2260  pfree(iter);
2261 }
2262 
2263 /***************** DELETION *****************/
2264 
2265 #ifdef RT_USE_DELETE
2266 
2267 /* Delete the element at "deletepos" */
2268 static inline void
2269 RT_SHIFT_ARRAYS_AND_DELETE(uint8 *chunks, RT_PTR_ALLOC * children, int count, int deletepos)
2270 {
2271  /*
2272  * This is basically a memmove, but written in a simple loop for speed on
2273  * small inputs.
2274  */
2275  for (int i = deletepos; i < count - 1; i++)
2276  {
2277  /* workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481 */
2278 #ifdef __GNUC__
2279  __asm__("");
2280 #endif
2281  chunks[i] = chunks[i + 1];
2282  children[i] = children[i + 1];
2283  }
2284 }
2285 
2286 /*
2287  * Copy both chunk and slot arrays into the right
2288  * place. The element at "deletepos" is deleted by skipping it.
2289  */
2290 static inline void
2291 RT_COPY_ARRAYS_AND_DELETE(uint8 *dst_chunks, RT_PTR_ALLOC * dst_children,
2292  uint8 *src_chunks, RT_PTR_ALLOC * src_children,
2293  int count, int deletepos)
2294 {
2295  for (int i = 0; i < count - 1; i++)
2296  {
2297  /*
2298  * use a branch-free computation to skip the index of the deleted
2299  * element
2300  */
2301  int sourceidx = i + (i >= deletepos);
2302  int destidx = i;
2303 
2304  dst_chunks[destidx] = src_chunks[sourceidx];
2305  dst_children[destidx] = src_children[sourceidx];
2306  }
2307 }
2308 
2309 /*
2310  * Note: While all node-growing functions are called to perform an insertion
2311  * when no more space is available, shrinking is not a hard-and-fast requirement.
2312  * When shrinking nodes, we generally wait until the count is about 3/4 of
2313  * the next lower node's fanout. This prevents ping-ponging between different
2314  * node sizes.
2315  *
2316  * Some shrinking functions delete first and then shrink, either because we
2317  * must or because it's fast and simple that way. Sometimes it's faster to
2318  * delete while shrinking.
2319  */
2320 
2321 /*
2322  * Move contents of a node256 to a node48. Any deletion should have happened
2323  * in the caller.
2324  */
2325 static void pg_noinline
2326 RT_SHRINK_NODE_256(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 chunk)
2327 {
2328  RT_NODE_256 *n256 = (RT_NODE_256 *) node.local;
2329  RT_CHILD_PTR newnode;
2330  RT_NODE_48 *new48;
2331  int slot_idx = 0;
2332 
2333  /* initialize new node */
2334  newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_48, RT_CLASS_48);
2335  new48 = (RT_NODE_48 *) newnode.local;
2336 
2337  /* copy over the entries */
2338  RT_COPY_COMMON(newnode, node);
2339  for (int i = 0; i < RT_NODE_MAX_SLOTS; i++)
2340  {
2341  if (RT_NODE_256_IS_CHUNK_USED(n256, i))
2342  {
2343  new48->slot_idxs[i] = slot_idx;
2344  new48->children[slot_idx] = n256->children[i];
2345  slot_idx++;
2346  }
2347  }
2348 
2349  /*
2350  * Since we just copied a dense array, we can fill "isset" using a single
2351  * store, provided the length of that array is at most the number of bits
2352  * in a bitmapword.
2353  */
2354  Assert(n256->base.count <= BITS_PER_BITMAPWORD);
2355  new48->isset[0] = (bitmapword) (((uint64) 1 << n256->base.count) - 1);
2356 
2357  /* free old node and update reference in parent */
2358  *parent_slot = newnode.alloc;
2359  RT_FREE_NODE(tree, node);
2360 }
2361 
2362 static inline void
2363 RT_REMOVE_CHILD_256(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 chunk)
2364 {
2365  int shrink_threshold;
2366  RT_NODE_256 *n256 = (RT_NODE_256 *) node.local;
2367  int idx = RT_BM_IDX(chunk);
2368  int bitnum = RT_BM_BIT(chunk);
2369 
2370  /* Mark the slot free for "chunk" */
2371  n256->isset[idx] &= ~((bitmapword) 1 << bitnum);
2372 
2373  n256->base.count--;
2374 
2375  /*
2376  * A full node256 will have a count of zero because of overflow, so we
2377  * delete first before checking the shrink threshold.
2378  */
2379  Assert(n256->base.count > 0);
2380 
2381  /* This simplifies RT_SHRINK_NODE_256() */
2382  shrink_threshold = BITS_PER_BITMAPWORD;
2383  shrink_threshold = Min(RT_FANOUT_48 / 4 * 3, shrink_threshold);
2384 
2385  if (n256->base.count <= shrink_threshold)
2386  RT_SHRINK_NODE_256(tree, parent_slot, node, chunk);
2387 }
2388 
2389 /*
2390  * Move contents of a node48 to a node16. Any deletion should have happened
2391  * in the caller.
2392  */
2393 static void pg_noinline
2394 RT_SHRINK_NODE_48(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 chunk)
2395 {
2396  RT_NODE_48 *n48 = (RT_NODE_48 *) (node.local);
2397  RT_CHILD_PTR newnode;
2398  RT_NODE_16 *new16;
2399  int destidx = 0;
2400 
2401  /*
2402  * Initialize new node. For now we skip the larger node16 size class for
2403  * simplicity.
2404  */
2405  newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_16, RT_CLASS_16_LO);
2406  new16 = (RT_NODE_16 *) newnode.local;
2407 
2408  /* copy over all existing entries */
2409  RT_COPY_COMMON(newnode, node);
2410  for (int chunk = 0; chunk < RT_NODE_MAX_SLOTS; chunk++)
2411  {
2412  if (n48->slot_idxs[chunk] != RT_INVALID_SLOT_IDX)
2413  {
2414  new16->chunks[destidx] = chunk;
2415  new16->children[destidx] = n48->children[n48->slot_idxs[chunk]];
2416  destidx++;
2417  }
2418  }
2419 
2420  Assert(destidx < new16->base.fanout);
2421 
2422  RT_VERIFY_NODE((RT_NODE *) new16);
2423 
2424  /* free old node and update reference in parent */
2425  *parent_slot = newnode.alloc;
2426  RT_FREE_NODE(tree, node);
2427 }
2428 
2429 static inline void
2430 RT_REMOVE_CHILD_48(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 chunk)
2431 {
2432  RT_NODE_48 *n48 = (RT_NODE_48 *) node.local;
2433  int deletepos = n48->slot_idxs[chunk];
2434 
2435  /* For now we skip the larger node16 size class for simplicity */
2436  int shrink_threshold = RT_FANOUT_16_LO / 4 * 3;
2437  int idx;
2438  int bitnum;
2439 
2440  Assert(deletepos != RT_INVALID_SLOT_IDX);
2441 
2442  idx = RT_BM_IDX(deletepos);
2443  bitnum = RT_BM_BIT(deletepos);
2444  n48->isset[idx] &= ~((bitmapword) 1 << bitnum);
2445  n48->slot_idxs[chunk] = RT_INVALID_SLOT_IDX;
2446 
2447  n48->base.count--;
2448 
2449  /*
2450  * To keep shrinking simple, do it after deleting, which is fast for
2451  * node48 anyway.
2452  */
2453  if (n48->base.count <= shrink_threshold)
2454  RT_SHRINK_NODE_48(tree, parent_slot, node, chunk);
2455 }
2456 
2457 /*
2458  * Move contents of a node16 to a node4, and delete the one at "deletepos".
2459  * By deleting as we move, we can avoid memmove operations in the new
2460  * node.
2461  */
2462 static void pg_noinline
2463 RT_SHRINK_NODE_16(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 deletepos)
2464 {
2465  RT_NODE_16 *n16 = (RT_NODE_16 *) (node.local);
2466  RT_CHILD_PTR newnode;
2467  RT_NODE_4 *new4;
2468 
2469  /* initialize new node */
2470  newnode = RT_ALLOC_NODE(tree, RT_NODE_KIND_4, RT_CLASS_4);
2471  new4 = (RT_NODE_4 *) newnode.local;
2472 
2473  /* copy over existing entries, except for the one at "deletepos" */
2474  RT_COPY_COMMON(newnode, node);
2475  RT_COPY_ARRAYS_AND_DELETE(new4->chunks, new4->children,
2476  n16->chunks, n16->children,
2477  n16->base.count, deletepos);
2478 
2479  new4->base.count--;
2480  RT_VERIFY_NODE((RT_NODE *) new4);
2481 
2482  /* free old node and update reference in parent */
2483  *parent_slot = newnode.alloc;
2484  RT_FREE_NODE(tree, node);
2485 }
2486 
2487 static inline void
2488 RT_REMOVE_CHILD_16(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 chunk, RT_PTR_ALLOC * slot)
2489 {
2490  RT_NODE_16 *n16 = (RT_NODE_16 *) node.local;
2491  int deletepos = slot - n16->children;
2492 
2493  /*
2494  * When shrinking to node4, 4 is hard-coded. After shrinking, the new node
2495  * will end up with 3 elements and 3 is the largest count where linear
2496  * search is faster than SIMD, at least on x86-64.
2497  */
2498  if (n16->base.count <= 4)
2499  {
2500  RT_SHRINK_NODE_16(tree, parent_slot, node, deletepos);
2501  return;
2502  }
2503 
2504  Assert(deletepos >= 0);
2505  Assert(n16->chunks[deletepos] == chunk);
2506 
2507  RT_SHIFT_ARRAYS_AND_DELETE(n16->chunks, n16->children,
2508  n16->base.count, deletepos);
2509  n16->base.count--;
2510 }
2511 
2512 static inline void
2513 RT_REMOVE_CHILD_4(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 chunk, RT_PTR_ALLOC * slot)
2514 {
2515  RT_NODE_4 *n4 = (RT_NODE_4 *) node.local;
2516 
2517  if (n4->base.count == 1)
2518  {
2519  Assert(n4->chunks[0] == chunk);
2520 
2521  /*
2522  * If we're deleting the last entry from the root child node, don't
2523  * free it, but mark both the tree and the root child node empty. That
2524  * way, RT_SET can assume it exists.
2525  */
2526  if (parent_slot == &tree->ctl->root)
2527  {
2528  n4->base.count = 0;
2529  tree->ctl->start_shift = 0;
2530  tree->ctl->max_val = RT_SHIFT_GET_MAX_VAL(0);
2531  }
2532  else
2533  {
2534  /*
2535  * Deleting last entry, so just free the entire node.
2536  * RT_DELETE_RECURSIVE has already freed the value and lower-level
2537  * children.
2538  */
2539  RT_FREE_NODE(tree, node);
2540 
2541  /*
2542  * Also null out the parent's slot -- this tells the next higher
2543  * level to delete its child pointer
2544  */
2545  *parent_slot = RT_INVALID_PTR_ALLOC;
2546  }
2547  }
2548  else
2549  {
2550  int deletepos = slot - n4->children;
2551 
2552  Assert(deletepos >= 0);
2553  Assert(n4->chunks[deletepos] == chunk);
2554 
2555  RT_SHIFT_ARRAYS_AND_DELETE(n4->chunks, n4->children,
2556  n4->base.count, deletepos);
2557 
2558  n4->base.count--;
2559  }
2560 }
2561 
2562 /*
2563  * Delete the child pointer corresponding to "key" in the given node.
2564  */
2565 static inline void
2566 RT_NODE_DELETE(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, RT_CHILD_PTR node, uint8 chunk, RT_PTR_ALLOC * slot)
2567 {
2568  switch ((node.local)->kind)
2569  {
2570  case RT_NODE_KIND_4:
2571  {
2572  RT_REMOVE_CHILD_4(tree, parent_slot, node, chunk, slot);
2573  return;
2574  }
2575  case RT_NODE_KIND_16:
2576  {
2577  RT_REMOVE_CHILD_16(tree, parent_slot, node, chunk, slot);
2578  return;
2579  }
2580  case RT_NODE_KIND_48:
2581  {
2582  RT_REMOVE_CHILD_48(tree, parent_slot, node, chunk);
2583  return;
2584  }
2585  case RT_NODE_KIND_256:
2586  {
2587  RT_REMOVE_CHILD_256(tree, parent_slot, node, chunk);
2588  return;
2589  }
2590  default:
2591  pg_unreachable();
2592  }
2593 }
2594 
2595 /* workhorse for RT_DELETE */
2596 static bool
2597 RT_DELETE_RECURSIVE(RT_RADIX_TREE * tree, RT_PTR_ALLOC * parent_slot, uint64 key, int shift)
2598 {
2599  RT_PTR_ALLOC *slot;
2600  RT_CHILD_PTR node;
2601  uint8 chunk = RT_GET_KEY_CHUNK(key, shift);
2602 
2603  node.alloc = *parent_slot;
2604  RT_PTR_SET_LOCAL(tree, &node);
2605  slot = RT_NODE_SEARCH(node.local, chunk);
2606 
2607  if (slot == NULL)
2608  return false;
2609 
2610  if (shift == 0)
2611  {
2612  if (!RT_CHILDPTR_IS_VALUE(*slot))
2613  RT_FREE_LEAF(tree, *slot);
2614 
2615  RT_NODE_DELETE(tree, parent_slot, node, chunk, slot);
2616  return true;
2617  }
2618  else
2619  {
2620  bool deleted;
2621 
2622  deleted = RT_DELETE_RECURSIVE(tree, slot, key, shift - RT_SPAN);
2623 
2624  /* Child node was freed, so delete its slot now */
2625  if (*slot == RT_INVALID_PTR_ALLOC)
2626  {
2627  Assert(deleted);
2628  RT_NODE_DELETE(tree, parent_slot, node, chunk, slot);
2629  }
2630 
2631  return deleted;
2632  }
2633 }
2634 
2635 /*
2636  * Delete the given key from the radix tree. If the key is found, delete
2637  * it and return true; otherwise do nothing and return false.
2638  *
2639  * Taking a lock in exclusive mode is the caller's responsibility.
2640  */
2641 RT_SCOPE bool
2642 RT_DELETE(RT_RADIX_TREE * tree, uint64 key)
2643 {
2644  bool deleted;
2645 
2646 #ifdef RT_SHMEM
2647  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
2648 #endif
2649 
2650  if (key > tree->ctl->max_val)
2651  return false;
2652 
2653  Assert(RT_PTR_ALLOC_IS_VALID(tree->ctl->root));
2654  deleted = RT_DELETE_RECURSIVE(tree, &tree->ctl->root,
2655  key, tree->ctl->start_shift);
2656 
2657  /* If we found the key to delete, update the statistics */
2658  if (deleted)
2659  {
2660  tree->ctl->num_keys--;
2661  Assert(tree->ctl->num_keys >= 0);
2662  }
2663 
2664  return deleted;
2665 }
2666 
2667 #endif /* RT_USE_DELETE */
2668 
2669 /***************** UTILITY FUNCTIONS *****************/
2670 
2671 /*
2672  * Return the amount of memory used by the radix tree.
2673  *
2674  * Since dsa_get_total_size() does appropriate locking, the caller doesn't
2675  * need to take a lock.
2676  */
2677 RT_SCOPE size_t
2678 RT_MEMORY_USAGE(RT_RADIX_TREE * tree)
2679 {
2680  size_t total = 0;
2681 
2682 #ifdef RT_SHMEM
2683  Assert(tree->ctl->magic == RT_RADIX_TREE_MAGIC);
2684  total = dsa_get_total_size(tree->dsa);
2685 #else
2686  total = MemoryContextMemAllocated(tree->context, true);
2687 #endif
2688 
2689  return total;
2690 }
2691 
2692 /*
2693  * Perform some sanity checks on the given node.
2694  */
2695 static void
2696 RT_VERIFY_NODE(RT_NODE * node)
2697 {
2698 #ifdef USE_ASSERT_CHECKING
2699 
2700  switch (node->kind)
2701  {
2702  case RT_NODE_KIND_4:
2703  {
2704  RT_NODE_4 *n4 = (RT_NODE_4 *) node;
2705 
2706  /* RT_DUMP_NODE(node); */
2707 
2708  for (int i = 1; i < n4->base.count; i++)
2709  Assert(n4->chunks[i - 1] < n4->chunks[i]);
2710 
2711  break;
2712  }
2713  case RT_NODE_KIND_16:
2714  {
2715  RT_NODE_16 *n16 = (RT_NODE_16 *) node;
2716 
2717  /* RT_DUMP_NODE(node); */
2718 
2719  for (int i = 1; i < n16->base.count; i++)
2720  Assert(n16->chunks[i - 1] < n16->chunks[i]);
2721 
2722  break;
2723  }
2724  case RT_NODE_KIND_48:
2725  {
2726  RT_NODE_48 *n48 = (RT_NODE_48 *) node;
2727  int cnt = 0;
2728 
2729  /* RT_DUMP_NODE(node); */
2730 
2731  for (int i = 0; i < RT_NODE_MAX_SLOTS; i++)
2732  {
2733  uint8 slot = n48->slot_idxs[i];
2734  int idx = RT_BM_IDX(slot);
2735  int bitnum = RT_BM_BIT(slot);
2736 
2737  if (!RT_NODE_48_IS_CHUNK_USED(n48, i))
2738  continue;
2739 
2740  /* Check if the corresponding slot is used */
2741  Assert(slot < node->fanout);
2742  Assert((n48->isset[idx] & ((bitmapword) 1 << bitnum)) != 0);
2743 
2744  cnt++;
2745  }
2746 
2747  Assert(n48->base.count == cnt);
2748 
2749  break;
2750  }
2751  case RT_NODE_KIND_256:
2752  {
2753  RT_NODE_256 *n256 = (RT_NODE_256 *) node;
2754  int cnt = 0;
2755 
2756  /* RT_DUMP_NODE(node); */
2757 
2758  for (int i = 0; i < RT_BM_IDX(RT_NODE_MAX_SLOTS); i++)
2759  cnt += bmw_popcount(n256->isset[i]);
2760 
2761  /*
2762  * Check that the number of used chunks matches, accounting for
2763  * overflow
2764  */
2765  if (cnt == RT_FANOUT_256)
2766  Assert(n256->base.count == 0);
2767  else
2768  Assert(n256->base.count == cnt);
2769 
2770  break;
2771  }
2772  }
2773 #endif
2774 }
2775 
2776 /***************** DEBUG FUNCTIONS *****************/
2777 
2778 #ifdef RT_DEBUG
2779 
2780 /*
2781  * Print out tree stats, some of which are only collected in debugging builds.
2782  */
2783 RT_SCOPE void
2784 RT_STATS(RT_RADIX_TREE * tree)
2785 {
2786  fprintf(stderr, "max_val = " UINT64_FORMAT "\n", tree->ctl->max_val);
2787  fprintf(stderr, "num_keys = %lld\n", (long long) tree->ctl->num_keys);
2788 
2789 #ifdef RT_SHMEM
2790  fprintf(stderr, "handle = " DSA_POINTER_FORMAT "\n", tree->ctl->handle);
2791 #endif
2792 
2793  fprintf(stderr, "height = %d", tree->ctl->start_shift / RT_SPAN);
2794 
2795  for (int i = 0; i < RT_NUM_SIZE_CLASSES; i++)
2796  {
2797  RT_SIZE_CLASS_ELEM size_class = RT_SIZE_CLASS_INFO[i];
2798 
2799  fprintf(stderr, ", n%d = %lld", size_class.fanout, (long long) tree->ctl->num_nodes[i]);
2800  }
2801 
2802  fprintf(stderr, ", leaves = %lld", (long long) tree->ctl->num_leaves);
2803 
2804  fprintf(stderr, "\n");
2805 }
2806 
2807 /*
2808  * Print out debugging information about the given node.
2809  */
2810 static void
2811 pg_attribute_unused()
2812 RT_DUMP_NODE(RT_NODE * node)
2813 {
2814 #ifdef RT_SHMEM
2815 #define RT_CHILD_PTR_FORMAT DSA_POINTER_FORMAT
2816 #else
2817 #define RT_CHILD_PTR_FORMAT "%p"
2818 #endif
2819 
2820  fprintf(stderr, "kind %d, fanout %d, count %u\n",
2821  (node->kind == RT_NODE_KIND_4) ? 4 :
2822  (node->kind == RT_NODE_KIND_16) ? 16 :
2823  (node->kind == RT_NODE_KIND_48) ? 48 : 256,
2824  node->fanout == 0 ? 256 : node->fanout,
2825  node->count == 0 ? 256 : node->count);
2826 
2827  switch (node->kind)
2828  {
2829  case RT_NODE_KIND_4:
2830  {
2831  RT_NODE_4 *n4 = (RT_NODE_4 *) node;
2832 
2833  fprintf(stderr, "chunks and slots:\n");
2834  for (int i = 0; i < n4->base.count; i++)
2835  {
2836  fprintf(stderr, " [%d] chunk %x slot " RT_CHILD_PTR_FORMAT "\n",
2837  i, n4->chunks[i], n4->children[i]);
2838  }
2839 
2840  break;
2841  }
2842  case RT_NODE_KIND_16:
2843  {
2844  RT_NODE_16 *n16 = (RT_NODE_16 *) node;
2845 
2846  fprintf(stderr, "chunks and slots:\n");
2847  for (int i = 0; i < n16->base.count; i++)
2848  {
2849  fprintf(stderr, " [%d] chunk %x slot " RT_CHILD_PTR_FORMAT "\n",
2850  i, n16->chunks[i], n16->children[i]);
2851  }
2852  break;
2853  }
2854  case RT_NODE_KIND_48:
2855  {
2856  RT_NODE_48 *n48 = (RT_NODE_48 *) node;
2857  char *sep = "";
2858 
2859  fprintf(stderr, "slot_idxs: \n");
2860  for (int chunk = 0; chunk < RT_NODE_MAX_SLOTS; chunk++)
2861  {
2862  if (!RT_NODE_48_IS_CHUNK_USED(n48, chunk))
2863  continue;
2864 
2865  fprintf(stderr, " idx[%d] = %d\n",
2866  chunk, n48->slot_idxs[chunk]);
2867  }
2868 
2869  fprintf(stderr, "isset-bitmap: ");
2870  for (int i = 0; i < (RT_FANOUT_48_MAX / BITS_PER_BYTE); i++)
2871  {
2872  fprintf(stderr, "%s%x", sep, ((uint8 *) n48->isset)[i]);
2873  sep = " ";
2874  }
2875  fprintf(stderr, "\n");
2876 
2877  fprintf(stderr, "chunks and slots:\n");
2878  for (int chunk = 0; chunk < RT_NODE_MAX_SLOTS; chunk++)
2879  {
2880  if (!RT_NODE_48_IS_CHUNK_USED(n48, chunk))
2881  continue;
2882 
2883  fprintf(stderr, " chunk %x slot " RT_CHILD_PTR_FORMAT "\n",
2884  chunk,
2885  *RT_NODE_48_GET_CHILD(n48, chunk));
2886  }
2887  break;
2888  }
2889  case RT_NODE_KIND_256:
2890  {
2891  RT_NODE_256 *n256 = (RT_NODE_256 *) node;
2892  char *sep = "";
2893 
2894  fprintf(stderr, "isset-bitmap: ");
2895  for (int i = 0; i < (RT_FANOUT_256 / BITS_PER_BYTE); i++)
2896  {
2897  fprintf(stderr, "%s%x", sep, ((uint8 *) n256->isset)[i]);
2898  sep = " ";
2899  }
2900  fprintf(stderr, "\n");
2901 
2902  fprintf(stderr, "chunks and slots:\n");
2903  for (int chunk = 0; chunk < RT_NODE_MAX_SLOTS; chunk++)
2904  {
2905  if (!RT_NODE_256_IS_CHUNK_USED(n256, chunk))
2906  continue;
2907 
2908  fprintf(stderr, " chunk %x slot " RT_CHILD_PTR_FORMAT "\n",
2909  chunk,
2910  *RT_NODE_256_GET_CHILD(n256, chunk));
2911  }
2912  break;
2913  }
2914  }
2915 }
2916 #endif /* RT_DEBUG */
2917 
2918 #endif /* RT_DEFINE */
2919 
2920 
2921 /* undefine external parameters, so next radix tree can be defined */
2922 #undef RT_PREFIX
2923 #undef RT_SCOPE
2924 #undef RT_DECLARE
2925 #undef RT_DEFINE
2926 #undef RT_VALUE_TYPE
2927 #undef RT_VARLEN_VALUE_SIZE
2928 #undef RT_RUNTIME_EMBEDDABLE_VALUE
2929 #undef RT_SHMEM
2930 #undef RT_USE_DELETE
2931 #undef RT_DEBUG
2932 
2933 /* locally declared macros */
2934 #undef RT_MAKE_PREFIX
2935 #undef RT_MAKE_NAME
2936 #undef RT_MAKE_NAME_
2937 #undef RT_STR
2938 #undef RT_STR_
2939 #undef RT_SPAN
2940 #undef RT_NODE_MAX_SLOTS
2941 #undef RT_CHUNK_MASK
2942 #undef RT_MAX_SHIFT
2943 #undef RT_MAX_LEVEL
2944 #undef RT_GET_KEY_CHUNK
2945 #undef RT_BM_IDX
2946 #undef RT_BM_BIT
2947 #undef RT_NODE_MUST_GROW
2948 #undef RT_NODE_KIND_COUNT
2949 #undef RT_NUM_SIZE_CLASSES
2950 #undef RT_INVALID_SLOT_IDX
2951 #undef RT_SLAB_BLOCK_SIZE
2952 #undef RT_RADIX_TREE_MAGIC
2953 #undef RT_CHILD_PTR_FORMAT
2954 
2955 /* type declarations */
2956 #undef RT_RADIX_TREE
2957 #undef RT_RADIX_TREE_CONTROL
2958 #undef RT_CHILD_PTR
2959 #undef RT_PTR_ALLOC
2960 #undef RT_INVALID_PTR_ALLOC
2961 #undef RT_HANDLE
2962 #undef RT_ITER
2963 #undef RT_NODE
2964 #undef RT_NODE_ITER
2965 #undef RT_NODE_KIND_4
2966 #undef RT_NODE_KIND_16
2967 #undef RT_NODE_KIND_48
2968 #undef RT_NODE_KIND_256
2969 #undef RT_NODE_4
2970 #undef RT_NODE_16
2971 #undef RT_NODE_48
2972 #undef RT_NODE_256
2973 #undef RT_SIZE_CLASS
2974 #undef RT_SIZE_CLASS_ELEM
2975 #undef RT_SIZE_CLASS_INFO
2976 #undef RT_CLASS_4
2977 #undef RT_CLASS_16_LO
2978 #undef RT_CLASS_16_HI
2979 #undef RT_CLASS_48
2980 #undef RT_CLASS_256
2981 #undef RT_FANOUT_4
2982 #undef RT_FANOUT_4_MAX
2983 #undef RT_FANOUT_16_LO
2984 #undef RT_FANOUT_16_HI
2985 #undef RT_FANOUT_16_MAX
2986 #undef RT_FANOUT_48
2987 #undef RT_FANOUT_48_MAX
2988 #undef RT_FANOUT_256
2989 
2990 /* function declarations */
2991 #undef RT_CREATE
2992 #undef RT_FREE
2993 #undef RT_ATTACH
2994 #undef RT_DETACH
2995 #undef RT_LOCK_EXCLUSIVE
2996 #undef RT_LOCK_SHARE
2997 #undef RT_UNLOCK
2998 #undef RT_GET_HANDLE
2999 #undef RT_FIND
3000 #undef RT_SET
3001 #undef RT_BEGIN_ITERATE
3002 #undef RT_ITERATE_NEXT
3003 #undef RT_END_ITERATE
3004 #undef RT_DELETE
3005 #undef RT_MEMORY_USAGE
3006 #undef RT_DUMP_NODE
3007 #undef RT_STATS
3008 
3009 /* internal helper functions */
3010 #undef RT_GET_VALUE_SIZE
3011 #undef RT_VALUE_IS_EMBEDDABLE
3012 #undef RT_CHILDPTR_IS_VALUE
3013 #undef RT_GET_SLOT_RECURSIVE
3014 #undef RT_DELETE_RECURSIVE
3015 #undef RT_ALLOC_NODE
3016 #undef RT_ALLOC_LEAF
3017 #undef RT_FREE_NODE
3018 #undef RT_FREE_LEAF
3019 #undef RT_FREE_RECURSE
3020 #undef RT_EXTEND_UP
3021 #undef RT_EXTEND_DOWN
3022 #undef RT_COPY_COMMON
3023 #undef RT_PTR_SET_LOCAL
3024 #undef RT_PTR_ALLOC_IS_VALID
3025 #undef RT_NODE_16_SEARCH_EQ
3026 #undef RT_NODE_4_GET_INSERTPOS
3027 #undef RT_NODE_16_GET_INSERTPOS
3028 #undef RT_SHIFT_ARRAYS_FOR_INSERT
3029 #undef RT_SHIFT_ARRAYS_AND_DELETE
3030 #undef RT_COPY_ARRAYS_FOR_INSERT
3031 #undef RT_COPY_ARRAYS_AND_DELETE
3032 #undef RT_NODE_48_IS_CHUNK_USED
3033 #undef RT_NODE_48_GET_CHILD
3034 #undef RT_NODE_256_IS_CHUNK_USED
3035 #undef RT_NODE_256_GET_CHILD
3036 #undef RT_KEY_GET_SHIFT
3037 #undef RT_SHIFT_GET_MAX_VAL
3038 #undef RT_NODE_SEARCH
3039 #undef RT_ADD_CHILD_4
3040 #undef RT_ADD_CHILD_16
3041 #undef RT_ADD_CHILD_48
3042 #undef RT_ADD_CHILD_256
3043 #undef RT_GROW_NODE_4
3044 #undef RT_GROW_NODE_16
3045 #undef RT_GROW_NODE_48
3046 #undef RT_REMOVE_CHILD_4
3047 #undef RT_REMOVE_CHILD_16
3048 #undef RT_REMOVE_CHILD_48
3049 #undef RT_REMOVE_CHILD_256
3050 #undef RT_SHRINK_NODE_16
3051 #undef RT_SHRINK_NODE_48
3052 #undef RT_SHRINK_NODE_256
3053 #undef RT_NODE_DELETE
3054 #undef RT_NODE_INSERT
3055 #undef RT_NODE_ITERATE_NEXT
3056 #undef RT_VERIFY_NODE