all Flashcards
(40 cards)
1
Q
Which security protocols does Kafka support?
A
- PLAINTEXT
- SSL
- SASL_PLAINTEXT
- SASL_SSL
2
Q
broker.id
A
- Broker config
- General broker parameter
- integer
3
Q
listeners
A
- Broker config
- General broker parameter
- comma-separated list of URIs
- URI look like:
<protocol>://<hostname>:<port> e.g. SSL://localhost:9091
4
Q
What happens if a broker’s listener port is lower than 1024?
A
Kafka must be started as root
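The listeners format and the port rule from the two cards above can be illustrated with a small parser. This is only a sketch: the names `Listener`, `parse_listeners`, and `privileged_ports` are made up for illustration and are not part of Kafka. It assumes an empty hostname is allowed (Kafka then binds to the default interface).

```python
from typing import NamedTuple


class Listener(NamedTuple):
    name: str  # listener name or security protocol, e.g. "SSL"
    host: str  # may be empty, meaning "bind to the default interface"
    port: int


def parse_listeners(value: str) -> list[Listener]:
    """Split a comma-separated listeners value into (name, host, port) parts."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, address = item.partition("://")
        if not sep:
            raise ValueError(f"missing '://' in listener {item!r}")
        host, sep, port = address.rpartition(":")
        if not sep:
            raise ValueError(f"missing port in listener {item!r}")
        result.append(Listener(name, host, int(port)))
    return result


def privileged_ports(listeners: list[Listener]) -> list[int]:
    """Ports below 1024, which require Kafka to be started as root."""
    return [listener.port for listener in listeners if listener.port < 1024]
```

For example, `privileged_ports(parse_listeners("SSL://localhost:9091,SSL://0.0.0.0:443"))` flags only port 443.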
5
Q
listener.security.protocol.map
A
- General broker parameter
- maps each listener name to a security protocol
- must be configured if a listener name is not itself a common security protocol
6
Q
zookeeper.connect
A
- Broker config
- General broker parameter
- comma-separated list of hostname:port entries, optionally followed by /path
/path is an optional ZooKeeper chroot path, given once at the end of the list
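A minimal sketch of how such a connection string breaks down, assuming the form used in Kafka's own documentation (comma-separated host:port entries with one optional chroot path at the end). The function name `parse_zookeeper_connect` is hypothetical, and the sketch requires every entry to carry an explicit port.

```python
def parse_zookeeper_connect(value: str) -> tuple[list[tuple[str, int]], str]:
    """Return ([(host, port), ...], chroot) for a zookeeper.connect value.

    The chroot is "" when no path is given.
    """
    # Everything from the first "/" onwards is the chroot path.
    hosts_part, slash, path = value.partition("/")
    chroot = slash + path if slash else ""
    servers = []
    for entry in hosts_part.split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))  # raises ValueError if the port is missing
    return servers, chroot
```

With `"zk1:2181,zk2:2181/kafka-cluster"` this yields the two servers and the chroot `/kafka-cluster`.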
7
Q
log.dirs
A
- Broker config
- General broker parameter
- the directories where log segments are stored
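- all log segments of one partition are stored within the same directory
- the broker places each new partition in the directory that currently holds the fewest partitions (“least used”), not the one with the most free space
- falls back to log.dir (singular) if not set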
8
Q
num.recovery.threads.per.data.dir
A
- Broker config
- General broker parameter
- num threads per log dir
- threads are used to:
- open log segment files
- close log segment files
- check and truncate log segment files after failure
- safe to increase their number
9
Q
auto.create.topics.enable
A
- Broker config
- General broker parameter
- the broker will automatically create topic when:
- producer starts writing
- consumer starts reading
- any client requests metadata
10
Q
auto.leader.rebalance.enable
A
- Broker config
- General broker parameter
- enables background thread checking distribution of partitions
- seeks to avoid having topic leadership concentrated in one or few brokers
11
Q
leader.imbalance.check.interval.seconds
A
- Broker config
- General broker parameter
- every how many seconds the broker will check for partition leader imbalances
12
Q
leader.imbalance.per.broker.percentage
A
- Broker config
- General broker parameter
- if leadership imbalance exceeds this value, then a rebalance is initiated
13
Q
delete.topic.enable
A
- General broker parameter
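- dangerous when enabled: topics can then be deleted; set it to false to lock the cluster against arbitrary topic deletions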
14
Q
num.partitions
A
- Broker config
- topic default
- defaults to 1
- primarily used when auto topic creation is enabled
- partitions can never be decreased
15
Q
default.replication.factor
A
- Broker config
- topic default
- if auto-topic creation enabled, this value sets the replication factor
- should be at least 1 above min.insync.replicas (RF+)
- even better is 2 above min.insync.replicas (RF++), which allows one planned and one unplanned outage at the same time, easing maintenance and preventing outages
16
Q
log.retention.ms
A
- Broker config
- topic default
- takes precedence over log.retention.minutes and log.retention.hours
- how long kafka will retain messages
- retention is performed by examining the last modified time of each log segment file on disk, which is normally the time the log segment was closed
- this retention is on topic level
- if log.retention.bytes has also been configured, messages may be removed when either criterion is met
17
Q
log.retention.minutes
A
- Broker config
- topic default
- takes precedence over log.retention.hours
18
Q
log.retention.hours
A
- Broker config
- topic default
- see log.retention.ms
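The precedence among the three retention settings above can be written down as a short function. This is an illustrative sketch, not broker code: `effective_retention_ms` is a made-up name, and the default of 168 hours is Kafka's default for log.retention.hours.

```python
from typing import Optional


def effective_retention_ms(
    ms: Optional[int] = None,
    minutes: Optional[int] = None,
    hours: int = 168,
) -> int:
    """Resolve retention time: ms beats minutes, minutes beats hours."""
    if ms is not None:
        return ms
    if minutes is not None:
        return minutes * 60 * 1000
    return hours * 60 * 60 * 1000
```

With nothing set, the result is one week expressed in milliseconds.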
19
Q
log.retention.bytes
A
- Broker config
- topic default
- applied per partition (bytes per partition, so adding partitions increases the total retention size of the topic)
- this and log.retention.ms can both be set; messages may then be removed when either criterion is met
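The per-partition arithmetic and the "either criterion" rule can be sketched as follows. Both function names are hypothetical, and the check is a simplification of what the broker does; it assumes -1 means "no size limit", which is Kafka's default for log.retention.bytes.

```python
def topic_retention_bytes(retention_bytes: int, partitions: int) -> int:
    """Upper bound on retained data for a topic (per replica)."""
    return retention_bytes * partitions


def retention_triggered(
    oldest_segment_age_ms: int,
    partition_size_bytes: int,
    retention_ms: int,
    retention_bytes: int = -1,
) -> bool:
    """True when either the time limit or the size limit is exceeded."""
    too_old = oldest_segment_age_ms > retention_ms
    too_big = retention_bytes >= 0 and partition_size_bytes > retention_bytes
    return too_old or too_big
```

For instance, a topic with 8 partitions and log.retention.bytes of 1 GiB retains at most 8 GiB.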
20
Q
log.segment.bytes
A
- Broker config
- topic default
- defaults to 1GB
- once a segment reaches the size specified in log.segment.bytes, it is closed and a new one is opened; only a closed segment can be considered for expiration
21
Q
log.roll.ms
A
- Broker config
- topic default
- the amount of time after which a log segment should be closed
- not mutually exclusive with log.segment.bytes
- for low-volume partitions that never reach log.segment.bytes, the time limit closes many log segments at the same moment, which can impact disk performance
22
Q
min.insync.replicas
A
- Broker config
- topic default
- defaults to 1
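- how many in-sync replicas must acknowledge a write for it to be successful (applies when the producer uses acks=all)
- setting it to 2 ensures that at least 2 replicas are caught up and in sync before a write is acknowledged to the producer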
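How min.insync.replicas interacts with the replication factor can be shown with two small helpers. This is a sketch under the assumption that the producer uses acks=all; the function names are made up for illustration.

```python
def write_accepted(in_sync_replicas: int, min_insync_replicas: int) -> bool:
    """With acks=all, a write succeeds only if enough replicas are in sync."""
    return in_sync_replicas >= min_insync_replicas


def tolerated_replica_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """How many replicas can be lost while writes are still accepted."""
    return max(replication_factor - min_insync_replicas, 0)
```

With a replication factor of 3 and min.insync.replicas of 2, one replica can be down and writes still succeed; with a replication factor of 4 (RF++), two can be down.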
23
Q
message.max.bytes
A
- Broker config
- topic default
- defaults to 1MB
- messages larger than this value are rejected and the producer receives an error
- the limit applies to the compressed size of a message
- must be coordinated with the configs:
- fetch.message.max.bytes
- replica.fetch.max.bytes
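One way to picture the coordination: a fetch size smaller than message.max.bytes cannot hold the largest message the broker accepts. The helper below is a hypothetical consistency check over plain numbers, not a Kafka API.

```python
def undersized_fetch_settings(message_max_bytes: int, fetch_settings: dict[str, int]) -> list[str]:
    """Names of fetch settings too small to hold the largest accepted message."""
    return sorted(
        name for name, size in fetch_settings.items() if size < message_max_bytes
    )
```

For example, raising message.max.bytes to 2,000,000 while replica.fetch.max.bytes stays at 1,048,576 makes the check report `replica.fetch.max.bytes`.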
24
Q
Major factors for performance bottlenecks
A
- disk throughput
- disk capacity
- memory
- CPU
- networking
25
Faster disk writes =
lower produce latency
26
Which part of memory is more important for Kafka?
The page cache. The heap only serves the JVM and can stay small: a 5 GB heap is enough for a broker handling 150k messages per second at a data rate of 200 megabits per second
27
Why is there a networking imbalance?
Outbound traffic is higher than inbound because one producer's data is typically read by many consumers. NICs of at least 10 Gb are recommended
28
Does Kafka need an extremely performant CPU?
No. Kafka uses the CPU mainly to decompress message batches, validate their checksums, and recompress the batches for storage on disk
29
Kafka broker and cluster size recommendations
- < 14K partition replicas per broker
- < 1M replicas per cluster
30
Broker configuration requirements
- all brokers must have the same `zookeeper.connect`
- all brokers must have a unique `broker.id`
31
OS Tuning - Virtual Memory
- set vm.swappiness = 1 (i.e. do not swap unless there is an out-of-memory condition)
- vm.dirty_background_ratio = 5 (default is 10; a percentage of total system memory)
- vm.dirty_ratio = 60 to 80 (default is 20; the percentage of total system memory that may be dirty pages before writes are flushed to disk synchronously)
- vm.max_map_count = 400k to 600k (the maximum number of memory map areas a process may have; a broker needs many because it memory-maps the index files of its log segments)
- vm.overcommit_memory = 0 (the default)
32
vm.swappiness
- OS virtual memory setting
- set vm.swappiness = 1 (i.e. do not swap unless there is an out-of-memory condition)
33
vm.dirty_background_ratio
- OS virtual memory setting
- set vm.dirty_background_ratio = 5 (default is 10; the percentage of total system memory allowed in dirty pages before the background process starts flushing them to disk)
34
vm.dirty_ratio
- OS virtual memory setting
- set vm.dirty_ratio = 60 to 80 (default is 20; the percentage of total system memory that may be dirty pages before writes are flushed to disk synchronously)
35
vm.max_map_count
- OS virtual memory setting
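- vm.max_map_count = 400k to 600k (the maximum number of memory map areas a process may have; a broker needs many because it memory-maps the index files of its log segments)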
36
vm.overcommit_memory
- OS virtual memory setting
- vm.overcommit_memory = 0 (the default); 0 means the kernel itself decides, heuristically, how much memory an application may be granted
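The virtual memory recommendations from the cards above could be collected into a sysctl-style fragment. The sketch below takes the lower end of each recommended range and uses the kernel's full name for the overcommit setting, vm.overcommit_memory. The names `RECOMMENDED_VM` and `render_sysctl` are hypothetical, and the values are starting points to tune, not fixed rules.

```python
# Starting values taken from the flashcards; tune them per environment.
RECOMMENDED_VM = {
    "vm.swappiness": 1,
    "vm.dirty_background_ratio": 5,
    "vm.dirty_ratio": 60,
    "vm.max_map_count": 400_000,
    "vm.overcommit_memory": 0,
}


def render_sysctl(settings: dict[str, int]) -> str:
    """Render settings as 'key = value' lines, the format sysctl.conf uses."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items()) + "\n"
```

The rendered text could be reviewed and then placed in a file under /etc/sysctl.d/.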
37
OS tuning - Disk
- XFS filesystem is better than Ext4
- set the `noatime` mount option (i.e. no access-time writes; disabling access-time writes is safe)
- set the `largeio` mount option, which improves efficiency for larger disk writes
38
OS tuning - networking
- increase socket buffer sizes
1. net.core.wmem_default = 131072 (128KiB)
2. net.core.rmem_default = 131072 (128KiB)
3. net.core.wmem_max = 2097152 (2MiB)
4. net.core.rmem_max = 2097152 (2MiB)
- increase TCP socket buffer sizes
1. net.ipv4.tcp_wmem
2. net.ipv4.tcp_rmem
each takes three space-separated values (minimum, default, maximum), e.g. 4096 65536 2048000 (4KiB, 64KiB, about 2MiB)
- net.ipv4.tcp_window_scaling = 1 allows more efficient data transfers
- net.ipv4.tcp_max_syn_backlog above the default of 1024 allows more simultaneous connections to be accepted
- net.core.netdev_max_backlog above the default of 1000 helps with bursts of network traffic
39
Kafka producer - mandatory properties
1. bootstrap.servers
2. key.serializer
3. value.serializer
40
Kafka producer - primary send methods
1. fire-and-forget
2. synchronous send
3. asynchronous send
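The three send methods, together with the three mandatory properties from the previous card, might look like this with the third-party kafka-python client. This is a sketch under assumptions: the helper names are made up, the topic and server address are supplied by the caller, and keys and values are assumed to be plain strings. The client is imported lazily so the send helpers work with any object exposing the same `send` interface.

```python
def make_producer(bootstrap_servers: str):
    """Create a producer with the three mandatory properties set."""
    from kafka import KafkaProducer  # third-party: pip install kafka-python

    return KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        key_serializer=str.encode,    # assumes str keys
        value_serializer=str.encode,  # assumes str values
    )


def fire_and_forget(producer, topic, key, value):
    """Send and ignore the outcome; failed sends go unnoticed."""
    producer.send(topic, key=key, value=value)


def send_sync(producer, topic, key, value, timeout_s=10):
    """Send and block until the broker responds; raises on failure."""
    future = producer.send(topic, key=key, value=value)
    return future.get(timeout=timeout_s)


def send_async(producer, topic, key, value, on_success, on_error):
    """Send without blocking; the callbacks run once the result is known."""
    future = producer.send(topic, key=key, value=value)
    future.add_callback(on_success)
    future.add_errback(on_error)
    return future
```

Synchronous sends are the simplest to reason about but the slowest, since each call waits for a broker round trip; the asynchronous form keeps throughput high while still surfacing errors.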