Chapter 1: Meet Kafka Flashcards
Batches can contain messages from multiple partitions. T/F
F
Batches can contain messages from multiple topics. T/F
F
What is the benefit of a larger batch size?
Increased throughput
What is the benefit of a smaller batch size?
Decreased latency
How does Kafka reduce the number of bytes in a batch before sending it across the network?
Compression
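A minimal sketch of why compressing a whole batch pays off, using only the Python standard library. The message contents are made up for illustration, and zlib stands in for the codecs Kafka supports (gzip, snappy, lz4, zstd); this is not how a Kafka client lays out a batch on the wire.

```python
import json
import zlib

# Hypothetical batch: 100 small JSON messages bound for one topic-partition.
messages = [
    json.dumps({"user_id": i, "event": "page_view"}).encode("utf-8")
    for i in range(100)
]

# Join the messages into one payload, then compress the whole batch once.
raw_batch = b"\n".join(messages)
compressed_batch = zlib.compress(raw_batch)

# Compressing each message on its own cannot exploit repetition across messages.
compressed_individually = sum(len(zlib.compress(m)) for m in messages)

print(len(raw_batch), len(compressed_batch), compressed_individually)
```

Because the messages share field names and structure, the batch compresses far better as a unit than message by message.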
What are the 3 most common schema types?
JSON, XML, Avro
How do Avro messages achieve a smaller size than JSON or XML messages?
Separating message payload and schema
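A rough illustration of the idea, assuming a hypothetical three-field record. It uses Python's struct module rather than the real Avro encoding (Avro has its own binary format), so it only shows why a payload without embedded field names is smaller.

```python
import json
import struct

# Hypothetical record; field names and types are illustrative only.
record = {"user_id": 42, "temperature": 21.5, "active": True}

# Self-describing encoding: field names travel inside every message.
json_payload = json.dumps(record).encode("utf-8")

# Schema-based encoding: both sides agree on field order and types up front
# (little-endian 32-bit int, 64-bit float, bool), so only the values are sent.
SCHEMA_FIELDS = ("user_id", "temperature", "active")
SCHEMA_FORMAT = "<id?"
binary_payload = struct.pack(SCHEMA_FORMAT, *(record[name] for name in SCHEMA_FIELDS))

# The reader rebuilds the record by combining the payload with the schema.
decoded = dict(zip(SCHEMA_FIELDS, struct.unpack(SCHEMA_FORMAT, binary_payload)))

print(len(json_payload), len(binary_payload))
```

The binary payload carries 13 bytes of values, while the JSON payload repeats every field name in every message.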
Are messages guaranteed to be ordered within a topic?
No
Are messages guaranteed to be ordered within a partition?
Yes
How do partitions increase scalability and redundancy?
Partitions of one topic can be hosted on different servers (scalability) and replicated to several servers (redundancy)
A producer is about to produce a message which has no key, and the producer is not using a custom partitioner. How does the producer decide which partition to use?
The default partitioner balances keyless messages evenly across all partitions of the topic
Describe two different methods of ensuring that two messages will be written to the same partition.
Use a custom partitioner or give both messages the same key
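The two partitioning cards above can be sketched as one small function. This is a toy model, not the client's code: the partition count is hypothetical, crc32 is a stand-in for the murmur2 hash used by Kafka's Java client, and plain round-robin is an assumption (newer clients use a sticky strategy for keyless messages).

```python
import itertools
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count for the topic

# Rotating counter used only for messages that have no key.
_round_robin = itertools.cycle(range(NUM_PARTITIONS))


def choose_partition(key):
    """Pick a partition: hash the key if present, otherwise rotate."""
    if key is None:
        return next(_round_robin)
    # crc32 is a stand-in hash; the same key always maps to the same partition.
    return zlib.crc32(key) % NUM_PARTITIONS


same_key = [choose_partition(b"user-42") for _ in range(3)]
no_key = [choose_partition(None) for _ in range(8)]
print(same_key, no_key)
```

Note that the key-to-partition mapping only stays stable while the number of partitions is unchanged.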
If I were to produce two messages to a topic, how could I ensure that those messages were consumed in the same order they were produced?
Put the messages in the same partition
What is the data type of an offset?
An integer (64-bit) that continually increases within a partition
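To tie ordering and offsets together, here is a toy model of a partition as an append-only log in which a message's offset is its position. The class and method names are hypothetical, and real partitions can have gaps in their offsets, which this sketch ignores.

```python
class Partition:
    """Toy append-only log; a message's offset is its position in the log."""

    def __init__(self):
        self._log = []

    def append(self, message):
        self._log.append(message)
        return len(self._log) - 1  # offset assigned to the new message

    def read_from(self, offset):
        # Messages come back in the order they were appended.
        return self._log[offset:]


partition = Partition()
offsets = [partition.append(m) for m in ("first", "second", "third")]
print(offsets)                 # [0, 1, 2]
print(partition.read_from(1))  # ['second', 'third']
```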
When a consumer restarts, how does it decide which message it should start reading?
It reads the last committed offset for each partition and resumes from there
What are the two places that the offset could be stored?
ZooKeeper or Kafka itself
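A minimal sketch of committing an offset and resuming after a restart. The dictionary is a hypothetical in-memory stand-in for the offset store, and starting at offset 0 when nothing has been committed is an assumption of this sketch (in a real consumer that behaviour is configurable).

```python
# Hypothetical in-memory stand-ins for one partition and the offset store.
partition_log = ["m0", "m1", "m2", "m3", "m4"]
committed_offsets = {}  # (group, partition id) -> next offset to read


def consume(group, partition_id, max_messages):
    """Read up to max_messages, starting at the group's committed offset."""
    start = committed_offsets.get((group, partition_id), 0)
    batch = partition_log[start:start + max_messages]
    # Commit the position of the next unread message.
    committed_offsets[(group, partition_id)] = start + len(batch)
    return batch


first_run = consume("billing", 0, 2)       # consumer reads two messages, then stops
after_restart = consume("billing", 0, 10)  # restarted consumer resumes at offset 2
print(first_run, after_restart)
```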
What is the cardinality between consumer groups and partitions?
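One-to-many within a group: each partition is consumed by exactly one member of the group, and one member can own many partitions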
How could one increase the throughput of a consumer group?
Adding more consumers to the group, up to the number of partitions
Consumer A owns partition B. How does Kafka ensure that partition B will continue to be processed in the event that consumer A dies?
Kafka rebalances the group and reassigns partition B to another consumer in consumer A's group
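The reassignment can be pictured with a toy assignment function. The consumer and partition names are hypothetical, and the simple round-robin rule here is an assumption for illustration; real groups use pluggable assignment strategies.

```python
def assign(partitions, consumers):
    """Spread partitions over live consumers; each partition gets one owner."""
    live = sorted(consumers)
    return {p: live[i % len(live)] for i, p in enumerate(sorted(partitions))}


partitions = ["B0", "B1", "B2", "B3"]

# Two live consumers share the partitions.
before = assign(partitions, {"consumer-a", "consumer-b"})

# consumer-a dies; recomputing the assignment moves its partitions to consumer-b.
after = assign(partitions, {"consumer-b"})

print(before)
print(after)
```

Every partition still has exactly one owner after the change, so processing continues without consumer-a.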
True or false? A single broker can handle millions of partitions.
False
True or false? A single broker can handle thousands of partitions.
True
True or false? A single broker can handle millions of messages per second.
True
Within a cluster, how many brokers are responsible for assigning partitions to consumers?
One
What is the name of the broker that is responsible for assigning partitions to brokers?
Controller