Chapter 1: Meet Kafka Flashcards

1
Q

Batches can contain messages from multiple partitions. T/F

A

F

2
Q

Batches can contain messages from multiple topics. T/F

A

F

3
Q

What is the benefit of a larger batch size?

A

Increased throughput

4
Q

What is the benefit of a smaller batch size?

A

Decreased latency

5
Q

How does Kafka reduce the number of bytes in a batch before sending it across the network?

A

Compression
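
A minimal sketch of why compressing a whole batch pays off, assuming a hypothetical batch of 100 similar JSON messages. It uses Python's gzip module only to show the effect on byte counts; the message fields are invented for illustration, and this is not Kafka's own batch format.

```python
import gzip
import json

# Hypothetical batch: 100 similar JSON-encoded messages bound for one partition.
messages = [
    json.dumps({"user_id": i, "event": "page_view", "page": "/home"}).encode("utf-8")
    for i in range(100)
]

# Join the batch into one payload, then compress the payload as a whole.
raw_batch = b"\n".join(messages)
compressed_batch = gzip.compress(raw_batch)

# Repeated field names and values across messages are what make the batch compress well.
print(len(raw_batch), len(compressed_batch))
```

Compressing the batch as a unit lets the codec exploit repetition between messages, which compressing each message separately could not.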

6
Q

What are the 3 most common schema types?

A

JSON, XML, Avro

7
Q

How do Avro messages achieve a smaller size than JSON or XML messages?

A

Separating message payload and schema
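
A rough illustration of the size saving from keeping the schema outside the payload. This is not Avro's actual wire format: it uses Python's struct module as a stand-in, and the schema and record are hypothetical.

```python
import json
import struct

# Hypothetical schema, stored once and shared by writer and reader: (field name, struct type code).
schema = [("user_id", "i"), ("score", "d")]
record = {"user_id": 42, "score": 9.5}

# Self-describing encoding: every message repeats the field names.
json_bytes = json.dumps(record).encode("utf-8")

# Schema-separated encoding: pack only the values, in schema order.
fmt = "<" + "".join(code for _, code in schema)
binary_bytes = struct.pack(fmt, *(record[name] for name, _ in schema))

# A reader that holds the same schema can rebuild the record from the bare values.
decoded = dict(zip((name for name, _ in schema), struct.unpack(fmt, binary_bytes)))

print(len(json_bytes), len(binary_bytes))
```

The binary payload carries 12 bytes of values, while the JSON form also pays for field names and punctuation in every message.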

8
Q

Are messages guaranteed to be ordered within a topic?

A

No

9
Q

Are messages guaranteed to be ordered within a partition?

A

Yes

10
Q

How do partitions increase scalability and redundancy?

A

By spreading a topic's partitions across multiple servers, so that no single server has to hold or serve the whole topic

11
Q

A producer is about to produce a message which has no key, and the producer is not using a custom partitioner. How does the producer decide which partition to use?

A

The default partitioner balances keyless messages evenly across all partitions of the topic

12
Q

Describe two different methods of ensuring that two messages will be written to the same partition.

A

Use a custom partitioner or give both messages the same key
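
A minimal sketch of why two messages with the same key land in the same partition: the key is hashed and the hash is mapped onto the partition count. The function name `choose_partition` is hypothetical, and `zlib.crc32` is a stand-in hash chosen for illustration, not the hash Kafka's default partitioner uses.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition: hash the key, then take it modulo the partition count."""
    # zlib.crc32 is a stand-in; it is not the hash Kafka's default partitioner uses.
    return zlib.crc32(key) % num_partitions


# Two messages with the same key always resolve to the same partition.
p1 = choose_partition(b"customer-17", 6)
p2 = choose_partition(b"customer-17", 6)
print(p1 == p2)
```

Because the result depends on the partition count, the key-to-partition mapping holds only while the number of partitions stays the same.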

13
Q

If I were to produce two messages to a topic, how could I ensure that those messages were consumed in the same order they were produced?

A

Put the messages in the same partition

14
Q

What is the data type of an offset?

A

An integer (a 64-bit value that increases with each message appended to the partition)

15
Q

When a consumer restarts, how does it decide which message it should start reading from?

A

It reads the last committed offset for each of its partitions and resumes from there
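
A toy model of resuming from a stored offset, assuming a single partition held in a Python list and a hypothetical `committed` dictionary standing in for the offset store. Starting from offset 0 when nothing has been committed is an assumption of this sketch, not a statement about Kafka's defaults.

```python
partition_log = ["m0", "m1", "m2", "m3", "m4"]  # messages at offsets 0..4

# Hypothetical offset store: (group, partition) -> next offset to read.
# It lives outside the consumer, so it survives a consumer restart.
committed = {}


def poll(group, partition, max_messages):
    """Read from the committed offset onward, then commit the new position."""
    start = committed.get((group, partition), 0)  # assumption: begin at 0 if nothing is committed
    batch = partition_log[start:start + max_messages]
    committed[(group, partition)] = start + len(batch)
    return batch


first = poll("billing", 0, 3)   # reads offsets 0, 1, 2
# The consumer restarts here; only the committed offset is remembered.
second = poll("billing", 0, 3)  # resumes at offset 3
print(first, second)
```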

16
Q

What are the two places that the offset could be stored?

A

ZooKeeper or Kafka itself

17
Q

What is the cardinality between consumer groups and partitions?

A

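One-to-many: within a group, a consumer can own many partitions, but each partition is consumed by only one member of the group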

18
Q

How could one increase the throughput of a consumer group?

A

Adding more consumers to the group (consumers beyond the number of partitions sit idle)

19
Q

Consumer A owns partition B. How does Kafka ensure that partition B will continue to be processed in the event that consumer A dies?

A

Kafka re-assigns partition B to another consumer from the group of consumer A
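
A simplified sketch of reassignment after a consumer dies. The `assign` function is hypothetical and uses plain round-robin over the live members; it stands in for Kafka's rebalance and does not reproduce the actual protocol or its assignment strategies.

```python
def assign(partitions, consumers):
    """Round-robin partitions over live consumers; each partition gets exactly one owner."""
    owners = sorted(consumers)
    return {p: owners[i % len(owners)] for i, p in enumerate(sorted(partitions))}


partitions = ["B", "C", "D", "E"]

# Both group members are alive: consumer-A owns partition B.
before = assign(partitions, {"consumer-A", "consumer-X"})

# consumer-A has died: its partitions move to the surviving member.
after = assign(partitions, {"consumer-X"})

print(before["B"], after["B"])
```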

20
Q

True or false? A single broker can handle millions of partitions.

21
Q

True or false? A single broker can handle thousands of partitions.

22
Q

True or false? A single broker can handle millions of messages per second.

23
Q

Within a cluster, how many brokers are responsible for assigning partitions to consumers?

24
Q

What is the name of the broker that is responsible for assigning partitions to brokers?

A

Controller

25
Q

Fill in the blank: all producers and consumers of a partition can be connected to a single broker, called the ___

A

Leader

26
Q

How does Kafka ensure redundancy of messages in a partition?

A

By replicating the partition across multiple brokers

27
Q

Describe two simple ways to make a Kafka topic store messages for 1 month.

A

Change the broker retention setting or the topic retention setting to 1 month

28
Q

How could I limit the size of data (in bytes) stored in a Kafka topic to 1 GB, without affecting other topics?

A

Set a size-based retention limit in that topic's own retention settings

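A small sketch of the arithmetic behind the two retention cards, using the topic-level settings `retention.ms` and `retention.bytes`. The 30-day month, the 1 GB = 1024**3 bytes convention, and the partition count of 4 are assumptions for illustration. Because `retention.bytes` is enforced per partition, the topic-wide budget is divided by the number of partitions.

```python
# Assumption: "1 month" is taken as 30 days.
retention_ms = 30 * 24 * 60 * 60 * 1000

# Assumptions: 1 GB means 1024**3 bytes, and the topic has 4 partitions (hypothetical).
topic_budget_bytes = 1024 ** 3
num_partitions = 4

# retention.bytes applies to each partition, so split the topic budget between them.
retention_bytes = topic_budget_bytes // num_partitions

topic_config = {
    "retention.ms": str(retention_ms),
    "retention.bytes": str(retention_bytes),
}
print(topic_config)
```
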
29
Q

How could I limit a Kafka topic so that only the most recent message for each key is stored?

A

Change the topic to be log compacted

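A toy illustration of what log compaction retains, assuming a hypothetical log of (key, value) pairs: only the most recent message for each key survives. The `compact` function is a sketch of the outcome, not of how Kafka performs compaction internally.

```python
# Hypothetical log of (key, value) messages, oldest first.
log = [
    ("user-1", "a@example.com"),
    ("user-2", "b@example.com"),
    ("user-1", "c@example.com"),  # newer value for user-1
]


def compact(messages):
    """Keep only the last message written for each key, preserving log order."""
    last_index = {key: i for i, (key, _) in enumerate(messages)}
    return [(key, value) for i, (key, value) in enumerate(messages) if last_index[key] == i]


compacted = compact(log)
print(compacted)
```
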
30
Q

Name 3 ways that one could increase the throughput of a topic.

A

It depends on the bottleneck. Options include:
- Increase batch size
- Increase number of consumers
- Increase number of brokers
- Increase number of partitions
- Increase computing power of servers
- Make consumer processing code more efficient
- Decrease message size

31
Q

A topic is experiencing high latency between the producer and the consumer. What could be done to reduce this latency?

A

Decrease batch size, decrease message size, and address any bottlenecks in the system (e.g., not enough brokers)

32
Q

How could I enforce the processing order of two messages?

A

Put the messages in the same topic and partition

33
Q

A topic was being processed quickly but a spike in message frequency has increased the processing time of messages. The brokers have plenty of computing power to spare so they are not the bottleneck. How could I increase the speed at which messages are processed without altering the code which processes messages?

A

Increase the number of consumers and/or number of partitions