Big Data Flashcards

1
Q

What characterizes Big Data?

A

Volume, Variety, Velocity, Veracity

2
Q

What is the software stack for Big Data management?

A

Data Analysis
NoSQL, Search, Streaming or SQL, Scripting
Data Processing Framework
Data Storage

Alongside all layers: Resource Management

3
Q

What distinguishes Spark from MapReduce?

A

Makes iterative processing easier
Immutable data (Resilient Distributed Dataset, RDD)
Lazy evaluation => optimizations (pipelining)
Lineage graph
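
The points above can be illustrated with a small toy model. This is a plain-Python sketch, not the Spark API: the class `LazyDataset` and its methods are hypothetical names chosen for illustration. It assumes a single machine and a linear chain of transformations, and shows immutability (each transformation returns a new object), lazy evaluation (nothing runs until `collect()`), pipelining (elements flow through `map` and `filter` in one pass) and a recorded lineage.

```python
class LazyDataset:
    """Toy stand-in for an RDD: immutable, lazy, and aware of its lineage."""

    def __init__(self, source=None, parent=None, op=None, label="source"):
        # Only the root dataset holds data; it is frozen into a tuple.
        self._source = tuple(source) if source is not None else None
        self._parent = parent    # dataset this one was derived from
        self._op = op            # function: parent iterator -> new iterator
        self._label = label      # name of the transformation

    def map(self, fn):
        # Returns a NEW dataset; the current one is never modified.
        return LazyDataset(parent=self, op=lambda it: (fn(x) for x in it), label="map")

    def filter(self, pred):
        return LazyDataset(parent=self, op=lambda it: (x for x in it if pred(x)), label="filter")

    def lineage(self):
        # Walk back to the root and report the chain of transformations.
        chain, node = [], self
        while node is not None:
            chain.append(node._label)
            node = node._parent
        return list(reversed(chain))

    def _iterate(self):
        if self._parent is None:
            return iter(self._source)
        # Generators are chained, so each element passes through all steps
        # before the next one is read (pipelining).
        return self._op(self._parent._iterate())

    def collect(self):
        # The action: only here is the recorded chain actually executed.
        return list(self._iterate())


calls = []

def traced_square(x):
    calls.append(x)   # records when the function really runs
    return x * x

numbers = LazyDataset([1, 2, 3, 4])
even_squares = numbers.map(traced_square).filter(lambda v: v % 2 == 0)
print(calls)                    # [] : nothing has been computed yet
print(even_squares.lineage())   # ['source', 'map', 'filter']
print(even_squares.collect())   # [4, 16]
```

Because the lineage is kept, calling `collect()` again simply recomputes the result from the source. Spark uses the same idea to rebuild lost partitions instead of replicating intermediate data.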

4
Q

What are the advantages of MapReduce over SQL?

A

Flexibility
Scalability
Efficiency
Fault tolerance

5
Q

Which MapReduce joins were presented?

A

Natural Join / Equi-Join
- Repartition Join
- Semi-Join
Theta-Join
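
As an illustration of the first variant, a repartition join (equi-join on the reduce side) might be sketched as follows. This is a single-process simulation in plain Python, not Hadoop code. The function names, the tags `"L"` and `"R"`, and the sample records are assumptions made for the example. The map phase tags each record with its source relation and emits it under the join key, the shuffle groups values by key, and the reduce phase combines the left and right records for each key.

```python
from collections import defaultdict


def map_phase(records, tag, key_field):
    # Emit (join key, (source tag, record)) for every input record.
    for rec in records:
        yield rec[key_field], (tag, rec)


def shuffle(pairs):
    # Group all values by key, as the framework would between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(tagged):
    # Separate the records by source, then build the cross product per key.
    left = [rec for tag, rec in tagged if tag == "L"]
    right = [rec for tag, rec in tagged if tag == "R"]
    for l in left:
        for r in right:
            yield {**l, **r}


def repartition_join(left, right, key_field):
    pairs = list(map_phase(left, "L", key_field))
    pairs += list(map_phase(right, "R", key_field))
    result = []
    for tagged in shuffle(pairs).values():
        result.extend(reduce_phase(tagged))
    return result


# Hypothetical sample data
users = [{"uid": 1, "name": "Ada"}, {"uid": 2, "name": "Bob"}]
orders = [
    {"uid": 1, "item": "book"},
    {"uid": 1, "item": "pen"},
    {"uid": 3, "item": "cup"},
]

joined = repartition_join(users, orders, "uid")
print(joined)
# [{'uid': 1, 'name': 'Ada', 'item': 'book'}, {'uid': 1, 'name': 'Ada', 'item': 'pen'}]
```

Both relations are shuffled in full here. A semi-join reduces that cost by first filtering out records whose key has no partner in the other relation, such as uid 2 and uid 3 in the sample data.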
