Storm Arch Flashcards
What are the types of nodes in Storm, and what process does each run?
Master and Worker nodes. The Master node runs the Nimbus daemon, and each Worker node runs the Supervisor daemon.
What is Nimbus's role?
Nimbus distributes code around the cluster, assigns tasks to machines, and monitors them for failures.
What is the Supervisor’s role?
It receives assignments from Nimbus and creates the worker processes needed to execute them. Each worker process executes a subset of a topology.
What is a topology?
It is a graph of computation in which each node is a Spout or a Bolt and the edges are streams. When it runs, it is executed by a group of worker processes distributed across many machines.
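A minimal sketch of the "graph of Spouts and Bolts" idea, written as a toy Python model rather than with Storm's own API. The node names, the dictionary layout, and the helpers `validate` and `downstream_of` are all hypothetical and exist only for illustration.

```python
# Toy model of a topology as a directed graph. This is NOT Storm's API;
# every name here is a hypothetical chosen for illustration.
topology = {
    "sentence-spout": {"kind": "spout", "inputs": []},
    "split-bolt": {"kind": "bolt", "inputs": ["sentence-spout"]},
    "count-bolt": {"kind": "bolt", "inputs": ["split-bolt"]},
}


def validate(topology):
    """Return a list of problems found in the graph (empty if none)."""
    problems = []
    for name, node in topology.items():
        if node["kind"] not in ("spout", "bolt"):
            problems.append(f"{name}: unknown kind {node['kind']!r}")
        if node["kind"] == "spout" and node["inputs"]:
            problems.append(f"{name}: a spout is a source and takes no inputs")
        if node["kind"] == "bolt" and not node["inputs"]:
            problems.append(f"{name}: a bolt must subscribe to at least one stream")
        for upstream in node["inputs"]:
            if upstream not in topology:
                problems.append(f"{name}: unknown input {upstream!r}")
    return problems


def downstream_of(topology, source):
    """Names of the nodes that subscribe directly to `source`."""
    return [name for name, node in topology.items() if source in node["inputs"]]
```

Here `validate(topology)` returns an empty list, and `downstream_of(topology, "sentence-spout")` lists the one Bolt that reads from the Spout.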
True or False? The Nimbus and Supervisor daemons are stateful.
False. Both are stateless and fail-fast: if one dies, it restarts immediately and carries on as if nothing had happened.
What is the role of Zookeeper cluster in Storm?
It coordinates between the Nimbus and Supervisor daemons. It also keeps their state, because the daemons themselves are stateless.
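A rough sketch of why keeping state outside the daemons makes restarts cheap. The `CoordinationStore` class is only an in-memory stand-in for the Zookeeper cluster, and `Daemon` is a hypothetical stateless process; neither reflects the real Zookeeper or Storm APIs.

```python
class CoordinationStore:
    """In-memory stand-in for the Zookeeper cluster (an assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class Daemon:
    """Hypothetical stateless daemon: it holds no assignments of its own."""

    def __init__(self, store):
        self.store = store

    def assign(self, task, machine):
        # Read-modify-write against the store; nothing is cached in the daemon.
        assignments = dict(self.store.get("assignments", {}))
        assignments[task] = machine
        self.store.put("assignments", assignments)

    def assignments(self):
        return self.store.get("assignments", {})


store = CoordinationStore()
first = Daemon(store)
first.assign("count-bolt", "machine-a")

del first                  # simulate the daemon dying
restarted = Daemon(store)  # a fresh process sees the same state
```

Because the restarted daemon reads everything from the store, `restarted.assignments()` still contains the assignment made before the crash.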
What is a stream?
It is an unbounded sequence of tuples.
What is a Spout?
A Spout is a source of streams. It may read tuples from different external sources and emit them as a stream.
What is a Bolt?
A Bolt is a consumer of streams that applies computation to them, such as filtering, aggregating, or joining, and may emit new streams. Bolts are where most of a topology's processing happens.
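The stream, Spout, and Bolt cards can be pictured with a small Python generator pipeline. This is only an analogy, not how Storm is programmed: the function names and the sample sentences are made up, and a real Spout would read from an external source rather than a fixed list.

```python
import itertools


def sentence_spout():
    """Spout: an unbounded source of one-field tuples."""
    sentences = ["the cow jumped", "the moon"]
    for sentence in itertools.cycle(sentences):  # never ends, like a stream
        yield (sentence,)


def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream):
    """Bolt: keeps a running count and emits (word, count) for each input tuple."""
    counts = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])


pipeline = count_bolt(split_bolt(sentence_spout()))
# The stream is unbounded, so take only the first few tuples.
first_six = list(itertools.islice(pipeline, 6))
# first_six == [("the", 1), ("cow", 1), ("jumped", 1), ("the", 2), ("moon", 1), ("the", 3)]
```

The pipeline never finishes on its own, so the caller decides how many tuples to pull from it.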
What happens when you submit a “topology” to be executed in the Storm cluster?
It will run forever, and it will be restarted if any failure happens. No data should be lost in case of failure.
How does Storm model its data?
Storm's data model is the tuple: a named list of values, where each field can be an object of any type. If you need to use a custom type of object, you just need to provide a serializer for it.
Each node in a topology must declare the fields of the tuples it will emit.
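A small Python sketch of "declare the fields, then emit matching values". The `Component` class and its `emit` method are hypothetical and are not Storm's API; a plain dictionary stands in for the named list of values.

```python
class Component:
    """Hypothetical node that declares its output fields once, up front."""

    def __init__(self, *fields):
        self.fields = fields

    def emit(self, *values):
        # Every emitted tuple must supply one value per declared field.
        if len(values) != len(self.fields):
            raise ValueError(
                f"expected {len(self.fields)} values, got {len(values)}"
            )
        # Pair each declared field name with its value, keeping declaration order.
        return dict(zip(self.fields, values))


word_counter = Component("word", "count")
tup = word_counter.emit("storm", 3)
# tup == {"word": "storm", "count": 3}
```

Declaring the field names in one place lets downstream nodes refer to values by name instead of by position.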
What is ZeroMQ? What are some of its characteristics?
ZeroMQ is a networking library. It is described as a socket library that acts as a concurrency framework, as faster than TCP for clustered products, and as providing asynchronous I/O for multicore message-passing applications.
What are Storm's operation modes, and how do they differ?
Local Mode and Remote Mode. Local Mode runs the topology in a single JVM and is good for development and testing, but you need to make sure your code is thread-safe before deploying it in Remote Mode. In Remote Mode the topology runs on a Storm cluster, and you cannot run it in debug mode there. It is always good to keep a single machine available for running in Local Mode.