Bloom Filters Flashcards

(47 cards)

1
Q

What is a Bloom filter?

A

A space-efficient probabilistic data structure used to test whether an element is in a set.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Can Bloom filters return false positives?

A

Yes, they may return false positives.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Can Bloom filters return false negatives?

A

No, they never return false negatives.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a false positive in a Bloom filter?

A

When the filter says an element is in the set but it is not.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are the main components of a Bloom filter?

A

A bit array and multiple independent hash functions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is the time complexity of adding an element to a Bloom filter?

A

O(k), where k is the number of hash functions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the time complexity of checking membership in a Bloom filter?

A

O(k)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What happens when you insert an element into a Bloom filter?

A

Each hash function sets a bit in the bit array.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What happens when you query a Bloom filter?

A

Each hash function checks a bit; if all are 1, the element may be in the set.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do you reduce false positive rates in a Bloom filter?

A

Increase the size of the bit array or use more hash functions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is the space complexity of a Bloom filter?

A

O(m), where m is the size of the bit array.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What are common use cases for Bloom filters?

A

Databases, web caching, spell checkers, and network systems.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the tradeoff in Bloom filters?

A

Smaller size increases false positives; larger size uses more memory.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is the optimal number of hash functions?

A

k = (m/n) * ln(2), where m is bit array size and n is number of elements.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the false positive probability of a Bloom filter?

A

Approximately (1 - e^(-kn/m))^k.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What happens if you insert the same element multiple times?

A

No change; bits are already set.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Are Bloom filters good for small datasets?

A

Not typically; the overhead may outweigh the benefits.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Are Bloom filters deterministic?

A

No, they are probabilistic.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Can Bloom filters support deletion?

A

Standard Bloom filters cannot support deletion.

20
Q

What is a counting Bloom filter?

A

A variant that allows deletion by using counters instead of bits.

21
Q

What kind of hash functions are used in Bloom filters?

A

Fast, independent hash functions that produce uniform output.

22
Q

Can Bloom filters be used in distributed systems?

A

Yes, especially for quick duplicate detection.

23
Q

What is the initialization cost of a Bloom filter?

A

O(m), to set all bits to 0.

24
Q

Can you resize a Bloom filter?

A

Not easily; resizing affects hash function outputs.

25
What is the difference between a Bloom filter and a hash set?
A Bloom filter is space-efficient but allows false positives; a hash set does not.
26
Is a Bloom filter faster than a hash set?
Yes, for large datasets and membership checks.
27
What is the role of a Bloom filter in databases?
To quickly check if a key might exist before performing a full lookup.
28
How is a Bloom filter used in spell checking?
To rule out non-existent words quickly.
29
How is a Bloom filter used in web caching?
To avoid redundant fetches for known non-cached items.
30
What is the bit array in a Bloom filter?
A binary array used to track possible presence of elements.
31
Can Bloom filters be combined?
Yes, via bitwise OR if they use the same parameters.
32
Can Bloom filters be used for exact membership?
No, they are approximate structures.
33
What is the advantage of Bloom filters over hash sets?
Much lower memory usage.
34
What is the disadvantage of Bloom filters?
They have a non-zero probability of false positives.
35
How is the size of the bit array chosen?
Based on the desired false positive rate and expected number of elements.
36
How is the number of hash functions chosen?
To minimize the false positive rate given the bit array size.
37
How do you implement multiple hash functions efficiently?
Use a base hash and derive others from it.
38
What is a real-world system that uses Bloom filters?
Apache HBase, Apache Cassandra, and Google Bigtable.
39
What is the purpose of the ln(2) in the hash function formula?
It optimizes the false positive rate.
40
Can Bloom filters be serialized?
Yes, they can be stored and transmitted.
41
Are Bloom filters secure?
No, they are not cryptographically secure.
42
Can Bloom filters be used for counting frequencies?
No, standard Bloom filters cannot count.
43
What is a scalable Bloom filter?
A dynamic Bloom filter that grows as more elements are added.
44
What is a compressed Bloom filter?
A smaller version optimized for transmission/storage.
45
What is the effect of too many hash functions?
Increased time complexity and more false positives.
46
What is the effect of too few hash functions?
Higher false positive probability.
47
Can Bloom filters store values?
No, they only track membership of keys.