1. Intro to Big Data Flashcards

1
Q

Which of the following is NOT one of the 5 Vs of Big Data?
A) Velocity
B) Variety
C) Veracity
D) Virtualization

A

D) Virtualization

2
Q

What is the main challenge in dealing with Big Data?
A) Finding enough data
B) Storing and analyzing the data
C) Keeping the data secure
D) Making the data visually appealing

A

B) Storing and analyzing the data

3
Q

What is Hadoop primarily used for?
A) Web development
B) Distributed data storage and processing
C) Graphic design
D) Game development

A

B) Distributed data storage and processing

4
Q

Which scaling approach involves spreading data and processing across more machines?
A) Scale-up
B) Scale-down
C) Scale-out
D) Scale-in

A

C) Scale-out
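
To make scale-out concrete, here is a minimal sketch of spreading record keys across several machines with a stable hash. The function names (assign_node, partition) and the choice of MD5 are illustrative assumptions, not something a particular Hadoop component does; real systems use their own placement logic.

```python
import hashlib

def assign_node(key: str, num_nodes: int) -> int:
    """Map a record key to one of num_nodes machines using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes

def partition(keys, num_nodes):
    """Group keys by the (hypothetical) node index that would hold them."""
    nodes = {n: [] for n in range(num_nodes)}
    for key in keys:
        nodes[assign_node(key, num_nodes)].append(key)
    return nodes

if __name__ == "__main__":
    keys = [f"user{i}" for i in range(100)]
    for node, held in partition(keys, 4).items():
        print(node, len(held))
```

Adding machines means raising num_nodes so each node holds a smaller share of the keys, which is the core idea behind scale-out as opposed to buying one larger machine (scale-up).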

5
Q

In Hadoop’s HDFS, what is the role of the NameNode?
A) It stores the actual data
B) It manages the file system namespace and regulates access to files
C) It performs checkpointing for the NameNode
D) It is a client that reads and writes data

A

B) It manages the file system namespace and regulates access to files

6
Q

Which of the following is a component of Hadoop’s architecture?
A) HBase
B) MapReduce
C) Cassandra
D) MongoDB

A

B) MapReduce
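
As a rough illustration of the MapReduce programming model, the sketch below runs a word count in plain Python on a single machine. It only mimics the map, shuffle, and reduce phases; it does not use the Hadoop API, and the function names are assumptions chosen for this example.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

In a real cluster the map and reduce calls would run in parallel on different machines, with the framework handling the shuffle step over the network.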

7
Q

What is the purpose of the SecondaryNameNode in HDFS?
A) It acts as a backup for the NameNode
B) It performs checkpointing for the NameNode
C) It stores the actual data
D) It manages the network traffic

A

B) It performs checkpointing for the NameNode

8
Q

Which of the following statements about HDFS is true?
A) HDFS is designed for small files
B) HDFS stores multiple copies of data blocks for fault tolerance
C) HDFS uses a peer-to-peer architecture
D) HDFS is primarily used for real-time data processing

A

B) HDFS stores multiple copies of data blocks for fault tolerance
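
The sketch below shows the idea of block replication under simplified assumptions: a file is cut into fixed-size blocks and each block is copied to several distinct DataNodes. The 128 MB block size and replication factor of 3 match common HDFS defaults, but the round-robin placement and the place_blocks helper are hypothetical; actual HDFS placement is rack-aware and decided by the NameNode.

```python
import math

def place_blocks(file_size_mb, datanodes, block_size_mb=128, replication=3):
    """Return {block_index: [datanode, ...]} using simple round-robin placement."""
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block in range(num_blocks):
        # Pick `replication` distinct nodes, starting at a rotating offset.
        placement[block] = [
            datanodes[(block + r) % len(datanodes)] for r in range(replication)
        ]
    return placement

if __name__ == "__main__":
    layout = place_blocks(300, ["dn1", "dn2", "dn3", "dn4"])
    for block, nodes in layout.items():
        print(block, nodes)
```

Because every block lives on several nodes, losing any single DataNode still leaves at least one copy of each block available, which is the fault-tolerance property the card refers to.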

9
Q

Which of the following is NOT one of the commonly cited 5 Vs of Big Data?
A) Velocity
B) Veracity
C) Volatility
D) Volume

A

C) Volatility

10
Q

In Hadoop, the NameNode:
A) Stores actual data
B) Manages the file system namespace
C) Performs data processing
D) Acts as a data node

A

B) Manages the file system namespace

11
Q

In Hadoop, data is stored in:
A) The NameNode
B) The ResourceManager
C) HDFS
D) YARN

A

C) HDFS
