Storm Architecture Flashcards

1
Q

What are the types of nodes in a Storm cluster, and which daemon does each run?

A

Master nodes and Worker nodes. The Master node runs the “Nimbus” daemon, and each Worker node runs the “Supervisor” daemon.

2
Q

What is the role of Nimbus?

A

Nimbus distributes code across the cluster, assigns tasks to machines, and monitors them for failures.

3
Q

What is the Supervisor’s role?

A

It receives assignments from Nimbus and starts or stops “worker” processes to execute them. Each worker process executes a subset of a topology.
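
To make the split between Nimbus (deciding who runs what) and the Supervisors (running it) more concrete, here is a minimal sketch in plain Python. It is a toy model, not the Storm API: the function `assign_round_robin`, the task names, and the node names are all hypothetical, and Storm's real scheduler is more involved than simple round-robin.

```python
def assign_round_robin(tasks, supervisors):
    """Toy stand-in for Nimbus: spread tasks evenly over supervisor nodes."""
    assignment = {node: [] for node in supervisors}
    for index, task in enumerate(tasks):
        node = supervisors[index % len(supervisors)]
        assignment[node].append(task)
    return assignment


# Hypothetical task and node names, for illustration only.
plan = assign_round_robin(["t0", "t1", "t2", "t3", "t4"], ["node-1", "node-2"])

# Each "supervisor" would then start worker processes for its own entry in the plan.
for node, node_tasks in plan.items():
    print(node, "runs", node_tasks)
```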

4
Q

What is a topology?

A

It is a graph of computation in which each node is a Spout or a Bolt and each edge is a stream between them. When running, it executes as a group of worker processes distributed across many machines.
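
The graph structure can be sketched as a small adjacency map. This is plain Python for illustration, not how Storm represents topologies internally; the node names and the helper functions `spouts` and `downstream` are assumptions made up for the example.

```python
# Toy model only: each key is a node in the topology, and each value
# lists the upstream nodes whose streams it subscribes to.
topology = {
    "sentence-spout": [],               # a spout has no upstream nodes
    "split-bolt": ["sentence-spout"],
    "count-bolt": ["split-bolt"],
}


def spouts(graph):
    """Return the nodes with no upstream streams (the sources)."""
    return sorted(name for name, inputs in graph.items() if not inputs)


def downstream(graph, node):
    """Return the nodes that consume the stream emitted by `node`."""
    return sorted(name for name, inputs in graph.items() if node in inputs)


print(spouts(topology))
print(downstream(topology, "split-bolt"))
```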

5
Q

True or False? The Nimbus and Supervisor daemons are stateful.

A

False. Both are stateless and fail-fast: if a daemon dies, it is restarted (typically by a process supervisor) and resumes as if nothing had happened.

6
Q

What is the role of the ZooKeeper cluster in Storm?

A

It coordinates the Nimbus and Supervisor daemons. Because those daemons are stateless, ZooKeeper holds their state for them.
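
The idea of a stateless daemon backed by an external store can be sketched as follows. This is a toy model in plain Python: `CoordinationStore` only stands in for ZooKeeper, and the class names, path layout, and task labels are hypothetical rather than taken from Storm.

```python
class CoordinationStore:
    """Stand-in for ZooKeeper: the only place where state lives."""

    def __init__(self):
        self._data = {}

    def put(self, path, value):
        self._data[path] = value

    def get(self, path, default=None):
        return self._data.get(path, default)


class ToySupervisor:
    """Keeps no state of its own; every read goes to the store."""

    def __init__(self, store, node_id):
        self.store = store
        self.node_id = node_id

    def assignments(self):
        return self.store.get(f"/assignments/{self.node_id}", [])


store = CoordinationStore()
# Written by the "Nimbus" side of the toy model (hypothetical path and labels).
store.put("/assignments/node-1", ["split-bolt task 0", "count-bolt task 1"])

before = ToySupervisor(store, "node-1").assignments()
# Simulate a crash and restart: a brand new object with no memory of the old one.
after = ToySupervisor(store, "node-1").assignments()
```

Because the restarted object reads the same store, it sees the same assignments as the one it replaced, which is what lets a fail-fast daemon die and come back safely.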

7
Q

What is a stream?

A

It is an unbounded sequence of tuples.

8
Q

What is a Spout?

A

A Spout is a source of streams. It reads data from external sources and emits it into the topology as a stream of tuples.

9
Q

What is a Bolt?

A

A Bolt consumes streams and applies computation to them, and it can emit new streams in turn. Bolts are where most of a topology's processing happens.
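
The relationship between streams, Spouts, and Bolts can be sketched with Python generators. This is a single-process toy, not the Storm API: the function names and the sample sentences are assumptions, and a real topology would run these stages in parallel across worker processes.

```python
import itertools


def sentence_spout():
    """Spout: an unbounded source of one-field tuples."""
    sentences = ["the cow jumped", "the moon"]
    for sentence in itertools.cycle(sentences):
        yield (sentence,)


def split_bolt(stream):
    """Bolt: consume sentence tuples and emit one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream):
    """Bolt: keep a running count and emit (word, count) tuples."""
    counts = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])


pipeline = count_bolt(split_bolt(sentence_spout()))

# The stream never ends, so take only the first few tuples.
first_five = list(itertools.islice(pipeline, 5))
print(first_five)
```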

10
Q

What happens when you submit a “topology” to be executed in the Storm cluster?

A

It runs indefinitely until you kill it. If any part of it fails, Storm restarts it, and no data should be lost as a result of the failure.

11
Q

How does Storm model its data?

A

Storm models data as tuples. A tuple is a named list of values, and each field can hold an object of any type. Primitive types, strings, and byte arrays are supported out of the box; to use any other type of object, you just need to provide a serializer for it.
Each node in a topology declares the fields of the tuples it emits.
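
A named list of values with declared fields can be sketched like this. The class `ToyTuple` and the field names are hypothetical and written in plain Python; in Storm's Java API the comparable pieces are `declareOutputFields` on a component and `getValueByField` on a tuple.

```python
class ToyTuple:
    """Toy model of a tuple: values are looked up by their declared field name."""

    def __init__(self, fields, values):
        if len(fields) != len(values):
            raise ValueError("number of values must match the declared fields")
        self._fields = list(fields)
        self._values = list(values)

    def fields(self):
        return list(self._fields)

    def get_value_by_field(self, name):
        return self._values[self._fields.index(name)]


# A node declares its output fields once; every tuple it emits follows them.
DECLARED_FIELDS = ["word", "count"]

emitted = ToyTuple(DECLARED_FIELDS, ["storm", 3])
print(emitted.get_value_by_field("word"), emitted.get_value_by_field("count"))
```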

12
Q

What is ZeroMQ? What are some of its characteristics?

A

ZeroMQ is a networking library. Its socket library can also serve as a concurrency framework, it is described as faster than TCP for clustered applications, and it provides asynchronous I/O for multicore message-passing applications.

13
Q

What are Storm's operation modes? What are the differences between them?

A

Local Mode and Remote Mode. Local Mode runs the whole topology in a single JVM, which makes it well suited to development and testing, but you need to make sure the code is thread-safe before deploying it in Remote Mode. In Remote Mode the topology runs on a Storm cluster, where you cannot run it in debug mode. It is therefore always good to keep one machine for running topologies in Local Mode.
