PostgreSQL Source Code  git master
brin_bloom.c
1 /*
2  * brin_bloom.c
3  * Implementation of Bloom opclass for BRIN
4  *
5  * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
6  * Portions Copyright (c) 1994, Regents of the University of California
7  *
8  *
9  * A BRIN opclass summarizing page range into a bloom filter.
10  *
11  * Bloom filters allow efficient testing whether a given page range contains
12  * a particular value. Therefore, if we summarize each page range into a small
13  * bloom filter, we can easily (and cheaply) test whether it contains values
14  * we get later.
15  *
16  * The index only supports equality operators, similarly to hash indexes.
17  * Bloom indexes are however much smaller, and support only bitmap scans.
18  *
19  * Note: Don't confuse this with bloom indexes, implemented in a contrib
20  * module. That extension implements an entirely new AM, building a bloom
21  * filter on multiple columns in a single row. This opclass works with an
22  * existing AM (BRIN) and builds bloom filter on a column.
23  *
24  *
25  * values vs. hashes
26  * -----------------
27  *
28  * The original column values are not used directly, but are first hashed
29  * using the regular type-specific hash function, producing a uint32 hash.
30  * And this hash value is then added to the summary - i.e. it's hashed
31  * again and added to the bloom filter.
32  *
33  * This allows the code to treat all data types (byval/byref/...) the same
34  * way, with only minimal space requirements, because we're working with
35  * hashes and not the original values. Everything is uint32.
36  *
37  * Of course, this assumes the built-in hash function is reasonably good,
38  * without too many collisions etc. But that does seem to be the case, at
39  * least based on past experience. After all, the same hash functions are
40  * used for hash indexes, hash partitioning and so on.
41  *
42  *
43  * hashing scheme
44  * --------------
45  *
46  * Bloom filters require a number of independent hash functions. There are
47  * different schemes for constructing them - for example we might use
48  * hash_uint32_extended with random seeds, but that seems fairly expensive.
49  * We use a scheme requiring only two functions described in this paper:
50  *
51  * Less Hashing, Same Performance: Building a Better Bloom Filter
52  * Adam Kirsch, Michael Mitzenmacher, Harvard School of Engineering and
53  * Applied Sciences, Cambridge, Massachusetts [DOI 10.1002/rsa.20208]
54  *
55  * The two hash functions h1 and h2 are calculated using hard-coded seeds,
56  * and then combined using (h1 + i * h2) to generate the hash functions.
57  *
58  *
59  * sizing the bloom filter
60  * -----------------------
61  *
62  * Size of a bloom filter depends on the number of distinct values we will
63  * store in it, and the desired false positive rate. The higher the number
64  * of distinct values and/or the lower the false positive rate, the larger
65  * the bloom filter. On the other hand, we want to keep the index as small
66  * as possible - that's one of the basic advantages of BRIN indexes.
67  *
68  * Although the number of distinct elements (in a page range) depends on
69  * the data, we can consider it fixed. This simplifies the trade-off to
70  * just false positive rate vs. size.
71  *
72  * At the page range level, false positive rate is a probability the bloom
73  * filter matches a random value. For the whole index (with sufficiently
74  * many page ranges) it represents the fraction of the index ranges (and
75  * thus fraction of the table to be scanned) matching the random value.
76  *
77  * Furthermore, the size of the bloom filter is subject to implementation
78  * limits - it has to fit onto a single index page (8kB by default). As
79  * the bitmap is inherently random (when "full" about half the bits are set
80  * to 1, randomly), compression can't help very much.
81  *
82  * To reduce the size of a filter (to fit to a page), we have to either
83  * accept higher false positive rate (undesirable), or reduce the number
84  * of distinct items to be stored in the filter. We can't alter the input
85  * data, of course, but we may make the BRIN page ranges smaller - instead
86  * of the default 128 pages (1MB) we may build index with 16-page ranges,
87  * or something like that. This should reduce the number of distinct values
88  * in the page range, making the filter smaller (with fixed false positive
89  * rate). Even for random data sets this should help, as the number of rows
90  * per heap page is limited (to ~290 with very narrow tables, likely ~20
91  * in practice).
92  *
93  * Of course, good sizing decisions depend on having the necessary data,
94  * i.e. number of distinct values in a page range (of a given size) and
95  * table size (to estimate cost change due to change in false positive
96  * rate due to having larger index vs. scanning larger indexes). We may
97  * not have that data - for example when building an index on empty table
98  * it's not really possible. And for some data we only have estimates for
99  * the whole table and we can only estimate per-range values (ndistinct).
100  *
101  * Another challenge is that while the bloom filter is per-column, it's
102  * the whole index tuple that has to fit into a page. And for multi-column
103  * indexes that may include pieces we have no control over (not necessarily
104  * bloom filters, the other columns may use other BRIN opclasses). So it's
105  * not entirely clear how to distribute the space between those columns.
106  *
107  * The current logic, implemented in brin_bloom_get_ndistinct, attempts to
108  * make some basic sizing decisions, based on the size of BRIN ranges, and
109  * the maximum number of rows per range.
110  *
111  *
112  * IDENTIFICATION
113  * src/backend/access/brin/brin_bloom.c
114  */
115 #include "postgres.h"
116 
117 #include "access/genam.h"
118 #include "access/brin.h"
119 #include "access/brin_internal.h"
120 #include "access/brin_page.h"
121 #include "access/brin_tuple.h"
122 #include "access/hash.h"
123 #include "access/htup_details.h"
124 #include "access/reloptions.h"
125 #include "access/stratnum.h"
126 #include "catalog/pg_type.h"
127 #include "catalog/pg_amop.h"
128 #include "utils/builtins.h"
129 #include "utils/datum.h"
130 #include "utils/lsyscache.h"
131 #include "utils/rel.h"
132 #include "utils/syscache.h"
133 
134 #include <math.h>
135 
136 #define BloomEqualStrategyNumber 1
137 
138 /*
139  * Additional SQL level support functions. We only have one, which is
140  * used to calculate hash of the input value.
141  *
142  * Procedure numbers must not use values reserved for BRIN itself; see
143  * brin_internal.h.
144  */
145 #define BLOOM_MAX_PROCNUMS 1 /* maximum support procs we need */
146 #define PROCNUM_HASH 11 /* required */
147 
148 /*
149  * Subtract this from procnum to obtain index in BloomOpaque arrays
150  * (Must be equal to minimum of private procnums).
151  */
152 #define PROCNUM_BASE 11
153 
154 /*
155  * Storage type for BRIN's reloptions.
156  */
157 typedef struct BloomOptions
158 {
159  int32 vl_len_; /* varlena header (do not touch directly!) */
160  double nDistinctPerRange; /* number of distinct values per range */
161  double falsePositiveRate; /* false positive rate for bloom filter */
162 } BloomOptions;
163 
164 /*
165  * The current min value (16) is somewhat arbitrary, but it's based
166  * on the fact that the filter header is ~20B alone, which is about
167  * the same as the filter bitmap for 16 distinct items with 1% false
168  * positive rate. So by allowing lower values we'd not gain much. In
169  * any case, the min should not be larger than MaxHeapTuplesPerPage
170  * (~290), which is the theoretical maximum for single-page ranges.
171  */
172 #define BLOOM_MIN_NDISTINCT_PER_RANGE 16
173 
174 /*
175  * Used to determine number of distinct items, based on the number of rows
176  * in a page range. The 10% is somewhat similar to what estimate_num_groups
177  * does, so we use the same factor here.
178  */
179 #define BLOOM_DEFAULT_NDISTINCT_PER_RANGE -0.1 /* 10% of values */
180 
181 /*
182  * Allowed range and default value for the false positive rate. The exact
183  * values are somewhat arbitrary, but were chosen considering the various
184  * parameters (size of filter vs. page size, etc.).
185  *
186  * The lower the false-positive rate, the more accurate the filter is, but
187  * it also gets larger - at some point this eliminates the main advantage
188  * of BRIN indexes, which is the tiny size. At 0.01% the index is about
189  * 10% of the table (assuming 290 distinct values per 8kB page).
190  *
191  * On the other hand, as the false-positive rate increases, larger part of
192  * the table has to be scanned due to mismatches - at 25% we're probably
193  * close to sequential scan being cheaper.
194  */
195 #define BLOOM_MIN_FALSE_POSITIVE_RATE 0.0001 /* 0.01% fp rate */
196 #define BLOOM_MAX_FALSE_POSITIVE_RATE 0.25 /* 25% fp rate */
197 #define BLOOM_DEFAULT_FALSE_POSITIVE_RATE 0.01 /* 1% fp rate */
198 
199 #define BloomGetNDistinctPerRange(opts) \
200  ((opts) && (((BloomOptions *) (opts))->nDistinctPerRange != 0) ? \
201  (((BloomOptions *) (opts))->nDistinctPerRange) : \
202  BLOOM_DEFAULT_NDISTINCT_PER_RANGE)
203 
204 #define BloomGetFalsePositiveRate(opts) \
205  ((opts) && (((BloomOptions *) (opts))->falsePositiveRate != 0.0) ? \
206  (((BloomOptions *) (opts))->falsePositiveRate) : \
207  BLOOM_DEFAULT_FALSE_POSITIVE_RATE)
208 
209 /*
210  * An estimate of the largest bloom we can fit onto a page. This is not
211  * a perfect guarantee, for a couple of reasons. For example, the row may
212  * be larger because the index has multiple columns.
213  */
214 #define BloomMaxFilterSize \
215  MAXALIGN_DOWN(BLCKSZ - \
216  (MAXALIGN(SizeOfPageHeaderData + \
217  sizeof(ItemIdData)) + \
218  MAXALIGN(sizeof(BrinSpecialSpace)) + \
219  SizeOfBrinTuple))
220 
221 /*
222  * Seeds used to calculate two hash functions h1 and h2, which are then used
223  * to generate k hashes using the (h1 + i * h2) scheme.
224  */
225 #define BLOOM_SEED_1 0x71d924af
226 #define BLOOM_SEED_2 0xba48b314
227 
228 /*
229  * Bloom Filter
230  *
231  * Represents a bloom filter, built on hashes of the indexed values. That is,
232  * we compute a uint32 hash of the value, and then store this hash into the
233  * bloom filter (and compute additional hashes on it).
234  *
235  * XXX We could implement "sparse" bloom filters, keeping only the bytes that
236  * are not entirely 0. But while indexes don't support TOAST, the varlena can
237  * still be compressed. So this seems unnecessary, because the compression
238  * should do the same job.
239  *
240  * XXX We can also watch the number of bits set in the bloom filter, and then
241  * stop using it (and not store the bitmap, to save space) when the false
242  * positive rate gets too high. But even if the false positive rate exceeds the
243  * desired value, it still can eliminate some page ranges.
244  */
245 typedef struct BloomFilter
246 {
247  /* varlena header (do not touch directly!) */
248  int32 vl_len_;
249 
250  /* space for various flags (unused for now) */
251  uint16 flags;
252 
253  /* fields for the HASHED phase */
254  uint8 nhashes; /* number of hash functions */
255  uint32 nbits; /* number of bits in the bitmap (size) */
256  uint32 nbits_set; /* number of bits set to 1 */
257 
258  /* data of the bloom filter */
259  char data[FLEXIBLE_ARRAY_MEMBER];
260 } BloomFilter;
261 
262 /*
263  * bloom_filter_size
264  * Calculate Bloom filter parameters (nbits, nbytes, nhashes).
265  *
266  * Given expected number of distinct values and desired false positive rate,
267  * calculates the optimal parameters of the Bloom filter.
268  *
269  * The resulting parameters are returned through nbytesp (number of bytes),
270  * nbitsp (number of bits) and nhashesp (number of hash functions). If a
271  * pointer is NULL, the parameter is not returned.
272  */
273 static void
274 bloom_filter_size(int ndistinct, double false_positive_rate,
275  int *nbytesp, int *nbitsp, int *nhashesp)
276 {
277  double k;
278  int nbits,
279  nbytes;
280 
281  /* sizing bloom filter: -(n * ln(p)) / (ln(2))^2 */
282  nbits = ceil(-(ndistinct * log(false_positive_rate)) / pow(log(2.0), 2));
283 
284  /* round m to whole bytes */
285  nbytes = ((nbits + 7) / 8);
286  nbits = nbytes * 8;
287 
288  /*
289  * round(log(2.0) * m / ndistinct), but assume round() may not be
290  * available on Windows
291  */
292  k = log(2.0) * nbits / ndistinct;
293  k = (k - floor(k) >= 0.5) ? ceil(k) : floor(k);
294 
295  if (nbytesp)
296  *nbytesp = nbytes;
297 
298  if (nbitsp)
299  *nbitsp = nbits;
300 
301  if (nhashesp)
302  *nhashesp = (int) k;
303 }
304 
305 /*
306  * bloom_init
307  * Initialize the Bloom Filter, allocate all the memory.
308  *
309  * The filter is initialized with optimal size for ndistinct expected values
310  * and the requested false positive rate. The filter is stored as varlena.
311  */
312 static BloomFilter *
313 bloom_init(int ndistinct, double false_positive_rate)
314 {
315  Size len;
316  BloomFilter *filter;
317 
318  int nbits; /* size of filter / number of bits */
319  int nbytes; /* size of filter / number of bytes */
320  int nhashes; /* number of hash functions */
321 
322  Assert(ndistinct > 0);
323  Assert(false_positive_rate > 0 && false_positive_rate < 1);
324 
325  /* calculate bloom filter size / parameters */
326  bloom_filter_size(ndistinct, false_positive_rate,
327  &nbytes, &nbits, &nhashes);
328 
329  /*
330  * Reject filters that are obviously too large to store on a page.
331  *
332  * Initially the bloom filter is just zeroes and so very compressible, but
333  * as we add values it gets more and more random, and so less and less
334  * compressible. So initially everything fits on the page, but we might
335  * get surprising failures later - we want to prevent that, so we reject
336  * bloom filters that are obviously too large.
337  *
338  * XXX It's not uncommon to oversize the bloom filter a bit, to defend
339  * against unexpected data anomalies (parts of table with more distinct
340  * values per range etc.). But we still need to make sure even the
341  * oversized filter fits on page, if such need arises.
342  *
343  * XXX This check is not perfect, because the index may have multiple
344  * filters that are small individually, but too large when combined.
345  */
346  if (nbytes > BloomMaxFilterSize)
347  elog(ERROR, "the bloom filter is too large (%d > %zu)", nbytes,
348  BloomMaxFilterSize);
349 
350  /*
351  * We allocate the whole filter. Most of it is going to be 0 bits, so the
352  * varlena is easy to compress.
353  */
354  len = offsetof(BloomFilter, data) + nbytes;
355 
356  filter = (BloomFilter *) palloc0(len);
357 
358  filter->flags = 0;
359  filter->nhashes = nhashes;
360  filter->nbits = nbits;
361 
362  SET_VARSIZE(filter, len);
363 
364  return filter;
365 }
366 
367 
368 /*
369  * bloom_add_value
370  * Add value to the bloom filter.
371  */
372 static BloomFilter *
373 bloom_add_value(BloomFilter *filter, uint32 value, bool *updated)
374 {
375  int i;
376  uint64 h1,
377  h2;
378 
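379  /* compute the hashes, used for the bloom filter */
380  h1 = hash_bytes_uint32_extended(value, BLOOM_SEED_1) % filter->nbits;
381  h2 = hash_bytes_uint32_extended(value, BLOOM_SEED_2) % filter->nbits;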
382 
383  /* compute the requested number of hashes */
384  for (i = 0; i < filter->nhashes; i++)
385  {
386  /* h1 + i * h2 */
387  uint32 h = (h1 + i * h2) % filter->nbits;
388  uint32 byte = (h / 8);
389  uint32 bit = (h % 8);
390 
391  /* if the bit is not set, set it and remember we did that */
392  if (!(filter->data[byte] & (0x01 << bit)))
393  {
394  filter->data[byte] |= (0x01 << bit);
395  filter->nbits_set++;
396  if (updated)
397  *updated = true;
398  }
399  }
400 
401  return filter;
402 }
403 
404 
405 /*
406  * bloom_contains_value
407  * Check if the bloom filter contains a particular value.
408  */
409 static bool
410 bloom_contains_value(BloomFilter *filter, uint32 value)
411 {
412  int i;
413  uint64 h1,
414  h2;
415 
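416  /* calculate the two hashes */
417  h1 = hash_bytes_uint32_extended(value, BLOOM_SEED_1) % filter->nbits;
418  h2 = hash_bytes_uint32_extended(value, BLOOM_SEED_2) % filter->nbits;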
419 
420  /* compute the requested number of hashes */
421  for (i = 0; i < filter->nhashes; i++)
422  {
423  /* h1 + i * h2 */
424  uint32 h = (h1 + i * h2) % filter->nbits;
425  uint32 byte = (h / 8);
426  uint32 bit = (h % 8);
427 
428  /* if the bit is not set, the value is not there */
429  if (!(filter->data[byte] & (0x01 << bit)))
430  return false;
431  }
432 
433  /* all hashes found in bloom filter */
434  return true;
435 }
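
/*
 * Illustration (not part of brin_bloom.c): a minimal stand-alone sketch of
 * the (h1 + i * h2) probing used by bloom_add_value and bloom_contains_value.
 * It assumes a fixed 160-bit bitmap with 7 probes, and it substitutes a
 * generic 64-bit mixer (sketch_hash, hypothetical) for PostgreSQL's
 * hash_bytes_uint32_extended, so the bit positions it produces will differ
 * from those of a real index. The two seeds are the ones defined above.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NBITS   160      /* assumed bitmap size, multiple of 8 */
#define SKETCH_NHASHES 7        /* assumed number of probes */

typedef struct SketchFilter
{
    uint32_t      nbits_set;
    unsigned char data[SKETCH_NBITS / 8];
} SketchFilter;

/* Stand-in seeded mixer; NOT the hash function PostgreSQL uses. */
static uint64_t
sketch_hash(uint32_t value, uint64_t seed)
{
    uint64_t z = (uint64_t) value + seed + UINT64_C(0x9e3779b97f4a7c15);

    z = (z ^ (z >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94d049bb133111eb);
    return z ^ (z >> 31);
}

/* Set the probe bits for value; return true if any bit was newly set. */
static bool
sketch_add(SketchFilter *f, uint32_t value)
{
    uint64_t h1 = sketch_hash(value, UINT64_C(0x71d924af)) % SKETCH_NBITS;
    uint64_t h2 = sketch_hash(value, UINT64_C(0xba48b314)) % SKETCH_NBITS;
    bool     updated = false;

    for (int i = 0; i < SKETCH_NHASHES; i++)
    {
        uint32_t h = (uint32_t) ((h1 + i * h2) % SKETCH_NBITS);

        if (!(f->data[h / 8] & (0x01 << (h % 8))))
        {
            f->data[h / 8] |= (unsigned char) (0x01 << (h % 8));
            f->nbits_set++;
            updated = true;
        }
    }
    return updated;
}

/* Return false as soon as one probe bit is clear; true means "maybe". */
static bool
sketch_contains(const SketchFilter *f, uint32_t value)
{
    uint64_t h1 = sketch_hash(value, UINT64_C(0x71d924af)) % SKETCH_NBITS;
    uint64_t h2 = sketch_hash(value, UINT64_C(0xba48b314)) % SKETCH_NBITS;

    for (int i = 0; i < SKETCH_NHASHES; i++)
    {
        uint32_t h = (uint32_t) ((h1 + i * h2) % SKETCH_NBITS);

        if (!(f->data[h / 8] & (0x01 << (h % 8))))
            return false;
    }
    return true;
}
```

/*
 * A value that was added is always reported as present. A value that was
 * never added may still be reported as present (a false positive), which is
 * why a match only means the page range has to be rechecked.
 */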
436 
437 typedef struct BloomOpaque
438 {
439  /*
440  * XXX At this point we only need a single proc (to compute the hash), but
441  * let's keep the array just like inclusion and minmax opclasses, for
442  * consistency. We may need additional procs in the future.
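443  */
444  FmgrInfo extra_procinfos[BLOOM_MAX_PROCNUMS];
445  bool extra_proc_missing[BLOOM_MAX_PROCNUMS];
446 } BloomOpaque;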
447 
448 static FmgrInfo *bloom_get_procinfo(BrinDesc *bdesc, uint16 attno,
449  uint16 procnum);
450 
451 
452 Datum
453 brin_bloom_opcinfo(PG_FUNCTION_ARGS)
454 {
455  BrinOpcInfo *result;
456 
457  /*
458  * opaque->strategy_procinfos is initialized lazily; here it is set to
459  * all-uninitialized by palloc0 which sets fn_oid to InvalidOid.
460  *
461  * bloom indexes only store the filter as a single BYTEA column
462  */
463 
464  result = palloc0(MAXALIGN(SizeofBrinOpcInfo(1)) +
465  sizeof(BloomOpaque));
466  result->oi_nstored = 1;
467  result->oi_regular_nulls = true;
468  result->oi_opaque = (BloomOpaque *)
469  MAXALIGN((char *) result + SizeofBrinOpcInfo(1));
470  result->oi_typcache[0] = lookup_type_cache(PG_BRIN_BLOOM_SUMMARYOID, 0);
471 
472  PG_RETURN_POINTER(result);
473 }
474 
475 /*
476  * brin_bloom_get_ndistinct
477  * Determine the ndistinct value used to size bloom filter.
478  *
479  * Adjust the ndistinct value based on the pagesPerRange value. First,
480  * if it's negative, it's assumed to be relative to maximum number of
481  * tuples in the range (assuming each page gets MaxHeapTuplesPerPage
482  * tuples, which is likely a significant over-estimate). We also clamp
483  * the value, not to over-size the bloom filter unnecessarily.
484  *
485  * XXX We can only do this when the pagesPerRange value was supplied.
486  * If it wasn't, it has to be a read-only access to the index, in which
487  * case we don't really care. But perhaps we should fall-back to the
488  * default pagesPerRange value?
489  *
490  * XXX We might also fetch info about ndistinct estimate for the column,
491  * and compute the expected number of distinct values in a range. But
492  * that may be tricky due to data being sorted in various ways, so it
493  * seems better to rely on the upper estimate.
494  *
495  * XXX We might also calculate a better estimate of rows per BRIN range,
496  * instead of using MaxHeapTuplesPerPage (which probably produces values
497  * much higher than reality).
498  */
499 static int
500 brin_bloom_get_ndistinct(BrinDesc *bdesc, BloomOptions *opts)
501 {
502  double ndistinct;
503  double maxtuples;
504  BlockNumber pagesPerRange;
505 
506  pagesPerRange = BrinGetPagesPerRange(bdesc->bd_index);
507  ndistinct = BloomGetNDistinctPerRange(opts);
508 
509  Assert(BlockNumberIsValid(pagesPerRange));
510 
511  maxtuples = MaxHeapTuplesPerPage * pagesPerRange;
512 
513  /*
514  * Similarly to n_distinct, negative values are relative - in this case to
515  * maximum number of tuples in the page range (maxtuples).
516  */
517  if (ndistinct < 0)
518  ndistinct = (-ndistinct) * maxtuples;
519 
520  /*
521  * Positive values are to be used directly, but we still apply a couple of
522  * safeties to avoid using unreasonably small bloom filters.
523  */
524  ndistinct = Max(ndistinct, BLOOM_MIN_NDISTINCT_PER_RANGE);
525 
526  /*
527  * And don't use more than the maximum possible number of tuples, in the
528  * range, which would be entirely wasteful.
529  */
530  ndistinct = Min(ndistinct, maxtuples);
531 
532  return (int) ndistinct;
533 }
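
/*
 * Worked example (not part of brin_bloom.c): a minimal stand-alone sketch of
 * the clamping done by brin_bloom_get_ndistinct. The function name and the
 * max_tuples_per_page parameter are hypothetical; the real code reads
 * MaxHeapTuplesPerPage and the index's pagesPerRange instead. The value 290
 * used in the note below follows the "~290" figure from the comments above.
 */

```c
#include <assert.h>

#define SKETCH_MIN_NDISTINCT 16.0   /* same value as BLOOM_MIN_NDISTINCT_PER_RANGE */

/*
 * Negative ndistinct is a fraction of the maximum tuple count in the range;
 * the result is then clamped to [SKETCH_MIN_NDISTINCT, maxtuples].
 */
static int
sketch_get_ndistinct(double ndistinct, unsigned pages_per_range,
                     int max_tuples_per_page)
{
    double maxtuples = (double) max_tuples_per_page * pages_per_range;

    assert(pages_per_range > 0 && max_tuples_per_page > 0);

    /* relative setting: scale by the maximum number of tuples */
    if (ndistinct < 0)
        ndistinct = (-ndistinct) * maxtuples;

    /* avoid unreasonably small filters */
    if (ndistinct < SKETCH_MIN_NDISTINCT)
        ndistinct = SKETCH_MIN_NDISTINCT;

    /* never size for more values than the range can hold */
    if (ndistinct > maxtuples)
        ndistinct = maxtuples;

    return (int) ndistinct;
}
```

/*
 * With 290 tuples per page, the default of -0.1 on a 128-page range gives
 * 3712, an explicit setting of 5 is raised to 16, and an explicit setting
 * of 1000000 on a single-page range is capped at 290.
 */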
534 
535 /*
536  * Examine the given index tuple (which contains partial status of a certain
537  * page range) by comparing it to the given value that comes from another heap
538  * tuple. If the new value is outside the bloom filter specified by the
539  * existing tuple values, update the index tuple and return true. Otherwise,
540  * return false and leave the index tuple unmodified.
541  */
542 Datum
543 brin_bloom_add_value(PG_FUNCTION_ARGS)
544 {
545  BrinDesc *bdesc = (BrinDesc *) PG_GETARG_POINTER(0);
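546  BrinValues *column = (BrinValues *) PG_GETARG_POINTER(1);
547  Datum newval = PG_GETARG_DATUM(2);
548  bool isnull PG_USED_FOR_ASSERTS_ONLY = PG_GETARG_DATUM(3);
549  BloomOptions *opts = (BloomOptions *) PG_GET_OPCLASS_OPTIONS();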
550  Oid colloid = PG_GET_COLLATION();
551  FmgrInfo *hashFn;
552  uint32 hashValue;
553  bool updated = false;
554  AttrNumber attno;
555  BloomFilter *filter;
556 
557  Assert(!isnull);
558 
559  attno = column->bv_attno;
560 
561  /*
562  * If this is the first non-null value, we need to initialize the bloom
563  * filter. Otherwise just extract the existing bloom filter from
564  * BrinValues.
565  */
566  if (column->bv_allnulls)
567  {
568  filter = bloom_init(brin_bloom_get_ndistinct(bdesc, opts),
569  BloomGetFalsePositiveRate(opts));
570  column->bv_values[0] = PointerGetDatum(filter);
571  column->bv_allnulls = false;
572  updated = true;
573  }
574  else
575  filter = (BloomFilter *) PG_DETOAST_DATUM(column->bv_values[0]);
576 
577  /*
578  * Compute the hash of the new value, using the supplied hash function,
579  * and then add the hash value to the bloom filter.
580  */
581  hashFn = bloom_get_procinfo(bdesc, attno, PROCNUM_HASH);
582 
583  hashValue = DatumGetUInt32(FunctionCall1Coll(hashFn, colloid, newval));
584 
585  filter = bloom_add_value(filter, hashValue, &updated);
586 
587  column->bv_values[0] = PointerGetDatum(filter);
588 
589  PG_RETURN_BOOL(updated);
590 }
591 
592 /*
593  * Given an index tuple corresponding to a certain page range and a scan key,
594  * return whether the scan key is consistent with the index tuple's bloom
595  * filter. Return true if so, false otherwise.
596  */
597 Datum
598 brin_bloom_consistent(PG_FUNCTION_ARGS)
599 {
600  BrinDesc *bdesc = (BrinDesc *) PG_GETARG_POINTER(0);
601  BrinValues *column = (BrinValues *) PG_GETARG_POINTER(1);
602  ScanKey *keys = (ScanKey *) PG_GETARG_POINTER(2);
603  int nkeys = PG_GETARG_INT32(3);
604  Oid colloid = PG_GET_COLLATION();
605  AttrNumber attno;
606  Datum value;
607  bool matches;
608  FmgrInfo *finfo;
609  uint32 hashValue;
610  BloomFilter *filter;
611  int keyno;
612 
613  filter = (BloomFilter *) PG_DETOAST_DATUM(column->bv_values[0]);
614 
615  Assert(filter);
616 
617  /*
618  * Assume all scan keys match. We'll be searching for a scan key
619  * eliminating the page range (we can stop on the first such key).
620  */
621  matches = true;
622 
623  for (keyno = 0; keyno < nkeys; keyno++)
624  {
625  ScanKey key = keys[keyno];
626 
627  /* NULL keys are handled and filtered-out in bringetbitmap */
628  Assert(!(key->sk_flags & SK_ISNULL));
629 
630  attno = key->sk_attno;
631  value = key->sk_argument;
632 
633  switch (key->sk_strategy)
634  {
635  case BloomEqualStrategyNumber:
636 
637  /*
638  * We want to return the current page range if the bloom
639  * filter seems to contain the value.
640  */
641  finfo = bloom_get_procinfo(bdesc, attno, PROCNUM_HASH);
642 
643  hashValue = DatumGetUInt32(FunctionCall1Coll(finfo, colloid, value));
644  matches &= bloom_contains_value(filter, hashValue);
645 
646  break;
647  default:
648  /* shouldn't happen */
649  elog(ERROR, "invalid strategy number %d", key->sk_strategy);
650  matches = false;
651  break;
652  }
653 
654  if (!matches)
655  break;
656  }
657 
658  PG_RETURN_BOOL(matches);
659 }
660 
661 /*
662  * Given two BrinValues, update the first of them as a union of the summary
663  * values contained in both. The second one is untouched.
664  *
665  * XXX We assume the bloom filters have the same parameters for now. In the
666  * future we should have 'can union' function, to decide if we can combine
667  * two particular bloom filters.
668  */
669 Datum
670 brin_bloom_union(PG_FUNCTION_ARGS)
671 {
672  int i;
673  int nbytes;
674  BrinValues *col_a = (BrinValues *) PG_GETARG_POINTER(1);
675  BrinValues *col_b = (BrinValues *) PG_GETARG_POINTER(2);
676  BloomFilter *filter_a;
677  BloomFilter *filter_b;
678 
679  Assert(col_a->bv_attno == col_b->bv_attno);
680  Assert(!col_a->bv_allnulls && !col_b->bv_allnulls);
681 
682  filter_a = (BloomFilter *) PG_DETOAST_DATUM(col_a->bv_values[0]);
683  filter_b = (BloomFilter *) PG_DETOAST_DATUM(col_b->bv_values[0]);
684 
685  /* make sure the filters use the same parameters */
686  Assert(filter_a && filter_b);
687  Assert(filter_a->nbits == filter_b->nbits);
688  Assert(filter_a->nhashes == filter_b->nhashes);
689  Assert((filter_a->nbits > 0) && (filter_a->nbits % 8 == 0));
690 
691  nbytes = (filter_a->nbits) / 8;
692 
693  /* simply OR the bitmaps */
694  for (i = 0; i < nbytes; i++)
695  filter_a->data[i] |= filter_b->data[i];
696 
697  PG_RETURN_VOID();
698 }
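
/*
 * Illustration (not part of brin_bloom.c): a minimal stand-alone sketch of
 * merging two bitmaps of identical size with a bytewise OR, as
 * brin_bloom_union does. The SketchBitmap type and the 4-byte size are
 * hypothetical. The sketch also recounts the set bits after the merge; that
 * recount is an addition made for this example and does not appear in the
 * listing above, which leaves nbits_set of the first filter untouched.
 */

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NBYTES 4         /* assumed bitmap size in bytes */

typedef struct SketchBitmap
{
    uint32_t      nbits_set;
    unsigned char data[SKETCH_NBYTES];
} SketchBitmap;

/* OR b into a, then recount a's set bits; b is left untouched. */
static void
sketch_union(SketchBitmap *a, const SketchBitmap *b)
{
    uint32_t count = 0;

    for (int i = 0; i < SKETCH_NBYTES; i++)
    {
        a->data[i] |= b->data[i];

        /* clear the lowest set bit until the byte is zero */
        for (unsigned char byte = a->data[i]; byte != 0;
             byte &= (unsigned char) (byte - 1))
            count++;
    }

    a->nbits_set = count;
}
```

/*
 * Because both inputs use the same size and the same hash functions, every
 * value present in either input is still reported as present by the merged
 * bitmap.
 */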
699 
700 /*
701  * Cache and return bloom opclass support procedure
702  *
703  * Return the procedure corresponding to the given function support number
704  * or null if it does not exist.
705  */
706 static FmgrInfo *
707 bloom_get_procinfo(BrinDesc *bdesc, uint16 attno, uint16 procnum)
708 {
709  BloomOpaque *opaque;
710  uint16 basenum = procnum - PROCNUM_BASE;
711 
712  /*
713  * We cache these in the opaque struct, to avoid repetitive syscache
714  * lookups.
715  */
716  opaque = (BloomOpaque *) bdesc->bd_info[attno - 1]->oi_opaque;
717 
718  /*
719  * If we already searched for this proc and didn't find it, don't bother
720  * searching again.
721  */
722  if (opaque->extra_proc_missing[basenum])
723  return NULL;
724 
725  if (opaque->extra_procinfos[basenum].fn_oid == InvalidOid)
726  {
727  if (RegProcedureIsValid(index_getprocid(bdesc->bd_index, attno,
728  procnum)))
729  {
730  fmgr_info_copy(&opaque->extra_procinfos[basenum],
731  index_getprocinfo(bdesc->bd_index, attno, procnum),
732  bdesc->bd_context);
733  }
734  else
735  {
736  opaque->extra_proc_missing[basenum] = true;
737  return NULL;
738  }
739  }
740 
741  return &opaque->extra_procinfos[basenum];
742 }
743 
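744 Datum
745 brin_bloom_options(PG_FUNCTION_ARGS)
746 {
747  local_relopts *relopts = (local_relopts *) PG_GETARG_POINTER(0);
748 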
749  init_local_reloptions(relopts, sizeof(BloomOptions));
750 
751  add_local_real_reloption(relopts, "n_distinct_per_range",
752  "number of distinct items expected in a BRIN page range",
753  BLOOM_DEFAULT_NDISTINCT_PER_RANGE,
754  -1.0, INT_MAX, offsetof(BloomOptions, nDistinctPerRange));
755 
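756  add_local_real_reloption(relopts, "false_positive_rate",
757  "desired false-positive rate for the bloom filters",
758  BLOOM_DEFAULT_FALSE_POSITIVE_RATE,
759  BLOOM_MIN_FALSE_POSITIVE_RATE,
760  BLOOM_MAX_FALSE_POSITIVE_RATE,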
761  offsetof(BloomOptions, falsePositiveRate));
762 
763  PG_RETURN_VOID();
764 }
765 
766 /*
767  * brin_bloom_summary_in
768  * - input routine for type brin_bloom_summary.
769  *
770  * brin_bloom_summary is only used internally to represent summaries
771  * in BRIN bloom indexes, so it has no operations of its own, and we
772  * disallow input too.
773  */
774 Datum
775 brin_bloom_summary_in(PG_FUNCTION_ARGS)
776 {
777  /*
778  * brin_bloom_summary stores the data in binary form and parsing text
779  * input is not needed, so disallow this.
780  */
781  ereport(ERROR,
782  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
783  errmsg("cannot accept a value of type %s", "pg_brin_bloom_summary")));
784 
785  PG_RETURN_VOID(); /* keep compiler quiet */
786 }
787 
788 
789 /*
790  * brin_bloom_summary_out
791  * - output routine for type brin_bloom_summary.
792  *
793  * BRIN bloom summaries are serialized into a bytea value, but we want
794  * to output something nicer that humans can understand.
795  */
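796 Datum
797 brin_bloom_summary_out(PG_FUNCTION_ARGS)
798 {
799  BloomFilter *filter;
800  StringInfoData str;
801 
802  /* detoast the data to get value with a full 4B header */
803  filter = (BloomFilter *) PG_DETOAST_DATUM_PACKED(PG_GETARG_DATUM(0));
804 
805  initStringInfo(&str);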
806  appendStringInfoChar(&str, '{');
807 
808  appendStringInfo(&str, "mode: hashed nhashes: %u nbits: %u nbits_set: %u",
809  filter->nhashes, filter->nbits, filter->nbits_set);
810 
811  appendStringInfoChar(&str, '}');
812 
813  PG_RETURN_CSTRING(str.data);
814 }
815 
816 /*
817  * brin_bloom_summary_recv
818  * - binary input routine for type brin_bloom_summary.
819  */
820 Datum
821 brin_bloom_summary_recv(PG_FUNCTION_ARGS)
822 {
823  ereport(ERROR,
824  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
825  errmsg("cannot accept a value of type %s", "pg_brin_bloom_summary")));
826 
827  PG_RETURN_VOID(); /* keep compiler quiet */
828 }
829 
830 /*
831  * brin_bloom_summary_send
832  * - binary output routine for type brin_bloom_summary.
833  *
834  * BRIN bloom summaries are serialized in a bytea value (although the
835  * type is named differently), so let's just send that.
836  */
837 Datum
838 brin_bloom_summary_send(PG_FUNCTION_ARGS)
839 {
840  return byteasend(fcinfo);
841 }