lec4 Flashcards

(33 cards)

1
Q
A
2
Q

What is parallel computing?

A

The simultaneous execution of multiple computations to improve performance and scalability. It is a foundation of modern cloud computing.

3
Q

What are the main models of parallelism?

A
  • Task Parallelism
  • Data Parallelism
  • Pipeline Parallelism
4
Q

Define Task Parallelism.

A

Different tasks are executed simultaneously across different computing units.
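As a minimal sketch of task parallelism, the snippet below submits two unrelated functions to a thread pool so they can run at the same time. The function names (`count_words`, `sum_squares`, `run_tasks`) are hypothetical and only for illustration. Note that in standard CPython builds, threads mainly help with I/O-bound work; a process pool is the usual choice for CPU-bound tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text: str) -> int:
    """Task A: count the whitespace-separated words in a string."""
    return len(text.split())


def sum_squares(n: int) -> int:
    """Task B: sum the squares of 0..n-1."""
    return sum(i * i for i in range(n))


def run_tasks():
    """Run two different tasks concurrently and collect both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        words_future = pool.submit(count_words, "the quick brown fox")
        squares_future = pool.submit(sum_squares, 5)
        # result() blocks until the corresponding task has finished.
        return words_future.result(), squares_future.result()
```

Each task does different work on different input, which is what distinguishes this model from data parallelism.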

5
Q

Define Data Parallelism.

A

The same operation is performed concurrently on subsets of a large dataset.
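A small sketch of data parallelism, assuming a list that fits in memory: the data is split into chunks and the same function is applied to every chunk by a pool of workers. The helper names (`chunked`, `square_all`, `parallel_square`) and the chunk size are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def square_all(chunk):
    """The single operation applied to every chunk."""
    return [x * x for x in chunk]


def parallel_square(data, chunk_size=3, workers=4):
    """Apply the same operation to each chunk concurrently, then merge."""
    chunks = chunked(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input chunks.
        results = list(pool.map(square_all, chunks))
    return [y for part in results for y in part]
```

In a real cluster the chunks would live on different machines, but the shape of the computation is the same.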

6
Q

Define Pipeline Parallelism.

A

A sequence of processing stages where the output of one stage serves as input to the next.
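One way to picture pipeline parallelism is a chain of worker threads connected by queues, where each stage can work on a new item while the next stage handles the previous one. This is only a sketch; the two stages used here (strip whitespace, then uppercase) and the names `stage`, `run_pipeline`, and `SENTINEL` are assumptions for illustration.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def stage(func, inbox, outbox):
    """Read items from inbox, apply func, and pass results downstream."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))


def run_pipeline(items):
    """Push items through two stages: strip whitespace, then uppercase."""
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(str.strip, q1, q2)),
        threading.Thread(target=stage, args=(str.upper, q2, q3)),
    ]
    for worker in workers:
        worker.start()
    for item in items:
        q1.put(item)
    q1.put(SENTINEL)

    results = []
    while True:
        item = q3.get()
        if item is SENTINEL:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

Because each stage is a single thread reading from a FIFO queue, items leave the pipeline in the order they entered.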

7
Q

How does parallel computing in the cloud work?

A

It is facilitated by distributed architectures that dynamically allocate resources as needed.

8
Q

What is MapReduce?

A

A programming model designed for processing large datasets in parallel across a distributed cluster.

9
Q

What are the two main phases of MapReduce?

A
  • Map Phase
  • Reduce Phase
10
Q

What occurs in the Map Phase of MapReduce?

A

Input data is split into smaller chunks that are processed in parallel; each map task emits intermediate key-value pairs.

11
Q

What occurs in the Reduce Phase of MapReduce?

A

The intermediate results are grouped by key, then aggregated and combined to generate the final output.
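The two phases can be sketched with a word count in plain Python. This is a single-machine toy, not Hadoop's API: the function names are hypothetical, and the `shuffle` step that groups values by key between the phases is an added assumption, since the cards do not mention it.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map phase: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped):
    """Group all emitted values by key (assumed intermediate step)."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce phase: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(chunks):
    """Run map over chunks concurrently, then shuffle and reduce."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_chunk, chunks))
    groups = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in groups.items())
```

Calling `word_count(["a b a", "b c"])` counts words across both chunks as if they were one document.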

12
Q

What does Hadoop provide?

A
  • HDFS (Hadoop Distributed File System)
  • YARN (Yet Another Resource Negotiator)
  • MapReduce API
13
Q

What is HDFS?

A

A scalable and fault-tolerant storage system.

14
Q

What is YARN?

A

A resource management layer that schedules and allocates resources efficiently.

15
Q

What is the limitation of Hadoop?

A

Its batch-processing nature makes it less suitable for real-time applications.

16
Q

What is Apache Spark?

A

An in-memory distributed computing engine that can process data much faster than Hadoop MapReduce, largely because it avoids writing intermediate results to disk between stages.

17
Q

What is the RDD abstraction in Spark?

A

Resilient Distributed Dataset, allowing data to be processed in-memory with fault tolerance.

18
Q

List key features of Apache Spark.

A
  • RDDs
  • DAG (Directed Acyclic Graph) Execution
  • Streaming Capabilities
  • Integration with Machine Learning
  • Graph Processing
19
Q

What is serverless computing?

A

A cloud model that allows developers to deploy code without managing the underlying infrastructure.

20
Q

What is Function-as-a-Service (FaaS)?

A

A cloud execution model where individual functions are executed in response to events.
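As a rough sketch, a FaaS function is often just a handler that receives an event and returns a response. The `handler(event, context)` shape below is loosely modeled on the convention used by Python handlers on AWS Lambda; the event field `name` and the response layout are assumptions for illustration, not any provider's required format.

```python
import json


def handler(event, context=None):
    """Respond to a single event; the platform decides when to invoke this."""
    name = event.get("name", "world")  # hypothetical event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Locally the function can be exercised by calling `handler({"name": "Ada"})`; in a deployed setting the provider would invoke it in response to an HTTP request, queue message, or similar trigger.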

21
Q

What are the benefits of serverless computing?

A
  • Automatic Scaling
  • Cost Efficiency
  • Reduced Operational Overhead
22
Q

Name popular FaaS providers.

A
  • AWS Lambda
  • Google Cloud Functions
  • Azure Functions
23
Q

What are cloud-native applications?

A

Applications designed specifically for cloud environments.

24
Q

What architecture do cloud-native applications leverage?

A
  • Microservices Architecture
  • Containerization
  • Orchestration
  • CI/CD Pipelines
25
Q

Define Microservices Architecture.

A

Divides applications into modular, loosely coupled services, each responsible for a specific function.

26
Q

What is Docker?

A

A containerization platform that packages applications and dependencies into lightweight, portable containers.

27
Q

What is Kubernetes?

A

An orchestration system that manages containerized applications at scale.

28
Q

What is a service mesh?

A

A tool for adding security, reliability, and observability features to cloud native applications.

29
Q

How is a service mesh implemented?

A

As a scalable set of network proxies deployed alongside application code.

30
Q

What does the service mesh's data plane consist of?

A

Proxies that handle communication between microservices.

31
Q

What is event-driven computing?

A

A paradigm where applications respond to events, crucial for building real-time, scalable cloud applications.

32
Q

List key technologies in event-driven computing.

A
  • Message Queues
  • Event-Driven FaaS
  • Stream Processing
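To make the message-queue idea concrete, here is a minimal in-process sketch: producers publish events to a queue, and handlers subscribed to an event type react when the queue is drained. The `EventBus` class and the event name `order.created` are hypothetical; a real system would use a networked broker rather than an in-memory queue.

```python
import queue


class EventBus:
    """Tiny in-memory event bus: publish events, then dispatch to handlers."""

    def __init__(self):
        self._queue = queue.Queue()
        self._handlers = {}

    def subscribe(self, event_type, handler):
        """Register a callable to run for every event of this type."""
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        """Enqueue an event; the producer does not wait for consumers."""
        self._queue.put((event_type, payload))

    def drain(self):
        """Dispatch all queued events; return the number of handler calls."""
        handled = 0
        while not self._queue.empty():
            event_type, payload = self._queue.get()
            for handler in self._handlers.get(event_type, []):
                handler(payload)
                handled += 1
        return handled
```

Producers and consumers only share the event type, which is what keeps them loosely coupled.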
33
Q

What is the conclusion about parallel programming in the cloud?

A

It has revolutionized modern computing, enabling efficient data processing, scalable applications, and resilient architectures.