Bloom Filters Flashcards
(47 cards)
What is a Bloom filter?
A space-efficient probabilistic data structure used to test whether an element is in a set.
Can Bloom filters return false positives?
Yes, they may return false positives.
Can Bloom filters return false negatives?
No, they never return false negatives.
What is a false positive in a Bloom filter?
When the filter says an element is in the set but it is not.
What are the main components of a Bloom filter?
A bit array and multiple independent hash functions.
What is the time complexity of adding an element to a Bloom filter?
O(k), where k is the number of hash functions.
What is the time complexity of checking membership in a Bloom filter?
O(k)
What happens when you insert an element into a Bloom filter?
Each hash function sets a bit in the bit array.
What happens when you query a Bloom filter?
Each hash function checks a bit; if all are 1, the element may be in the set.
How do you reduce false positive rates in a Bloom filter?
Increase the size of the bit array or use more hash functions.
What is the space complexity of a Bloom filter?
O(m), where m is the size of the bit array.
What are common use cases for Bloom filters?
Databases, web caching, spell checkers, and network systems.
What is the tradeoff in Bloom filters?
Smaller size increases false positives; larger size uses more memory.
What is the optimal number of hash functions?
k = (m/n) * ln(2), where m is bit array size and n is number of elements.
What is the false positive probability of a Bloom filter?
Approximately (1 - e^(-kn/m))^k.
What happens if you insert the same element multiple times?
No change; bits are already set.
Are Bloom filters good for small datasets?
Not typically; the overhead may outweigh the benefits.
Are Bloom filters deterministic?
No, they are probabilistic.
Can Bloom filters support deletion?
Standard Bloom filters cannot support deletion.
What is a counting Bloom filter?
A variant that allows deletion by using counters instead of bits.
What kind of hash functions are used in Bloom filters?
Fast, independent hash functions that produce uniform output.
Can Bloom filters be used in distributed systems?
Yes, especially for quick duplicate detection.
What is the initialization cost of a Bloom filter?
O(m), to set all bits to 0.
Can you resize a Bloom filter?
Not easily; resizing affects hash function outputs.