Batch vs. Stream Processing Flashcards

Question 1

Q

What are two common approaches for processing data?

Answer

A

Batch and stream processing.

Question 2

Q

What is batch processing?

Answer

A

Processing data in large, discrete blocks, typically on a fixed interval or once some threshold (such as a record count) is met.
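As a rough illustration of the "after meeting some threshold" case, here is a minimal Python sketch. The `ThresholdBatcher` class, its `threshold` parameter, and the sample values are hypothetical names chosen for this example, not part of any particular framework.

```python
from typing import Callable, List


class ThresholdBatcher:
    """Buffers records and processes them as one block once a size threshold is met."""

    def __init__(self, threshold: int, process: Callable[[List[int]], None]) -> None:
        self.threshold = threshold  # assumed batch size that triggers processing
        self.process = process      # callback that handles one whole batch
        self.buffer: List[int] = []

    def add(self, record: int) -> None:
        # Records wait in the buffer; nothing is processed until the threshold is hit.
        self.buffer.append(record)
        if len(self.buffer) >= self.threshold:
            self.flush()

    def flush(self) -> None:
        # Process whatever has accumulated, then start a fresh batch.
        if self.buffer:
            self.process(self.buffer)
            self.buffer = []


batch_totals: List[int] = []
batcher = ThresholdBatcher(threshold=3, process=lambda batch: batch_totals.append(sum(batch)))
for value in [1, 2, 3, 4, 5, 6, 7]:
    batcher.add(value)
batcher.flush()  # handle the leftover partial batch

print(batch_totals)  # [6, 15, 7]
```

An interval-based batcher would look similar, with a timer rather than a record count triggering `flush`.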

Question 3

Q

What are two characteristics of batch processing?

Answer

A

Latency and throughput. Batch processing generally introduces latency (while waiting for data to be collected) but achieves high throughput.

Question 4

Q

What are two pros of batch processing?

Answer

A

Efficiency and simplicity.

It can be resource-efficient for systems with large volumes of data (batches can be better optimized) and is generally simpler to implement than stream processing.

Question 5

Q

What are the two major cons of batch processing?

Answer

A

Delay in insights and inflexibility.

Because a batch typically needs some amount of data to accumulate before processing starts, results are usually delayed, which makes batch processing less practical for real-time scenarios. It also typically isn’t flexible enough to react to immediate changes or to adapt based on the incoming data.

Question 6

Q

What is stream processing?

Answer

A

Stream processing involves continually processing data as soon as it arrives.
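To make "as soon as it arrives" concrete, here is a minimal Python sketch in which every incoming event immediately produces an updated result. The `running_average` function and the sample events are hypothetical, and a plain list stands in for a live source such as a socket or message queue.

```python
from typing import Iterable, Iterator


def running_average(events: Iterable[float]) -> Iterator[float]:
    """Yield an updated average right after each event, without waiting for more data."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count  # a result is available per event


# The list below is only a stand-in for a real, unbounded event source.
stream_results = list(running_average([10.0, 20.0, 30.0]))

print(stream_results)  # [10.0, 15.0, 20.0]
```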

Question 7

Q

What are two characteristics of stream processing?

Answer

A

Immediate processing and real-time suitability.

Question 8

Q

What are the two pros of stream processing?

Answer

A

Real-time analysis and dynamic data handling.

Because data is processed in real time, systems can provide insights and take actions immediately. Stream processing is also more adaptable to changing data and conditions.

Question 9

Q

What are the two cons of stream processing?

Answer

A

Complexity and resource-intensity.

Stream processing is generally more complex than batch processing and can require significantly more resources, because data must be processed as it arrives.

Question 10

Q

When might you use batch processing? What about stream processing? Can you provide some real-world examples?

Answer

A

Batch processing is preferred in scenarios where all of the data is already available and results are needed on a schedule, such as daily or weekly financial reporting.

Stream processing is preferred in scenarios where real-time insights are required, such as fraud detection or real-time analytics.
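The two use cases can be contrasted in a small Python sketch over the same data. Everything here is hypothetical: the `TRANSACTIONS` sample, the function names, and the fixed `limit`, which is a deliberately simplistic stand-in for a real fraud check.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Transaction = Tuple[str, str, float]  # (day, account, amount)

# Made-up sample data for illustration only.
TRANSACTIONS: List[Transaction] = [
    ("2024-01-01", "acct-1", 40.0),
    ("2024-01-01", "acct-2", 9500.0),
    ("2024-01-02", "acct-1", 60.0),
]


def daily_report(transactions: List[Transaction]) -> Dict[str, float]:
    """Batch style: run once over the complete data set and total amounts per day."""
    totals: Dict[str, float] = defaultdict(float)
    for day, _account, amount in transactions:
        totals[day] += amount
    return dict(totals)


def flag_suspicious(transactions: Iterable[Transaction], limit: float = 5000.0) -> Iterator[Transaction]:
    """Stream style: inspect each transaction as it arrives and flag it right away."""
    for txn in transactions:
        if txn[2] > limit:  # assumed rule: any amount above the limit is suspicious
            yield txn


report = daily_report(TRANSACTIONS)
alerts = list(flag_suspicious(iter(TRANSACTIONS)))

print(report)  # {'2024-01-01': 9540.0, '2024-01-02': 60.0}
print(alerts)  # [('2024-01-01', 'acct-2', 9500.0)]
```

The report only makes sense once the whole period's data is in, while an alert is useful only if it fires as soon as the suspicious transaction is seen.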


(10 cards)