Apache Hive Flashcards

1
Q

In one sentence, what is Hive?

A

A data warehouse system built on top of Hadoop for managing and querying structured data stored in HDFS.

2
Q

Which technology does Hive rely on, and how?

A

Hive classically relies on MapReduce as its underlying processing engine: it translates each query into a series of MapReduce jobs and uses their parallelism to process large data sets. Newer Hive versions can also run on other engines such as Apache Tez or Apache Spark.
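
As a rough illustration of how a grouped aggregate could be broken into map, shuffle, and reduce steps, here is a minimal pure-Python sketch. It is not Hive's actual query plan. The table `page_views`, its columns, and the sample rows are hypothetical, and the three functions only mimic the phases a MapReduce job would run for something like `SELECT country, SUM(views) FROM page_views GROUP BY country`.

```python
from collections import defaultdict

# Hypothetical rows of a table page_views(country, views).
rows = [("US", 3), ("DE", 1), ("US", 2), ("FR", 4), ("DE", 5)]


def map_phase(rows):
    """Emit one (key, value) pair per input row, as a mapper would."""
    for country, views in rows:
        yield country, views


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum each key's values, as a reducer would for SUM(views)."""
    return {key: sum(values) for key, values in groups.items()}


totals = reduce_phase(shuffle_phase(map_phase(rows)))
print(totals)
```

On a real cluster the map and reduce steps run in parallel across many machines; here they run in one process purely to show the data flow.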

3
Q

How do you use Hive to retrieve data?

A

You write queries in HiveQL, a SQL-like language, which gives you a familiar SQL interface to data stored in a distributed file system.
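
To give a feel for what such a query looks like, the sketch below runs a plain `SELECT ... GROUP BY` statement from Python. Note the assumption: it uses the standard-library `sqlite3` module as a local stand-in, not Hive itself, because this basic syntax is common to both. The table `page_views` and its data are hypothetical.

```python
import sqlite3

# Stand-in for a Hive connection: an in-memory SQLite database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("US", 3), ("DE", 1), ("US", 2)],
)

# A basic aggregate query of the kind you would also write in HiveQL.
query = """
SELECT country, SUM(views) AS total_views
FROM page_views
GROUP BY country
ORDER BY country
"""

result = conn.execute(query).fetchall()
print(result)
conn.close()
```

Against a real Hive deployment the query text would look the same, but it would be submitted through a Hive client and executed as distributed jobs over files in HDFS.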

4
Q

What other than a DFS can you use Hive on?

A

A CSV file

5
Q

How does Hive compare to a regular SQL DB?

A

Hive and traditional SQL databases differ in their design and use cases. While a regular SQL database is typically optimized for online transaction processing (OLTP) with low-latency queries, Hive is designed for online analytical processing (OLAP) and is well-suited for handling large-scale data warehousing and analysis.
