lec4 Flashcards
(33 cards)
What is parallel computing?
The simultaneous execution of multiple computations to improve performance and scalability; it is a foundation of modern cloud computing.
What are the main models of parallelism?
- Task Parallelism
- Data Parallelism
- Pipeline Parallelism
Define Task Parallelism.
Different tasks are executed simultaneously across different computing units.
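As a minimal sketch of task parallelism, the Python snippet below submits two unrelated tasks to a thread pool so they can run at the same time. The function names `count_words`, `sum_numbers`, and `run_tasks` are hypothetical, and the threads only stand in for separate computing units.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    """Hypothetical task 1: count the words in a string."""
    return len(text.split())


def sum_numbers(numbers):
    """Hypothetical task 2: add up a list of numbers."""
    return sum(numbers)


def run_tasks():
    # Two *different* tasks are submitted together; each may run on its own worker.
    with ThreadPoolExecutor(max_workers=2) as pool:
        words_future = pool.submit(count_words, "map shuffle reduce")
        total_future = pool.submit(sum_numbers, [1, 2, 3, 4])
        # result() blocks until the corresponding task has finished.
        return words_future.result(), total_future.result()
```

The point of the sketch is that the two workers do different jobs, in contrast to data parallelism, where every worker runs the same operation.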
Define Data Parallelism.
The same operation is performed concurrently on subsets of a large dataset.
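A small Python sketch of data parallelism, assuming a thread pool as a stand-in for a set of computing units: the dataset is split into chunks and the same function is applied to every chunk. The helper names `chunked`, `square_all`, and `parallel_square` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def square_all(chunk):
    """The single operation applied to every chunk."""
    return [x * x for x in chunk]


def parallel_square(data, chunk_size=2):
    chunks = chunked(data, chunk_size)
    with ThreadPoolExecutor() as pool:
        # pool.map applies the same function to each chunk and
        # yields results in the original chunk order.
        results = list(pool.map(square_all, chunks))
    # Flatten the per-chunk results back into one list.
    return [value for part in results for value in part]
```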
Define Pipeline Parallelism.
A sequence of processing stages where the output of one stage serves as input to the next.
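One way to picture pipeline parallelism is with two stages connected by queues, each stage running in its own thread so that stage 2 can work on one item while stage 1 handles the next. This is only a sketch; the names `stage`, `run_pipeline`, and `SENTINEL` are hypothetical, and the two stages (strip whitespace, then uppercase) are chosen purely for illustration.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def stage(func, inbox, outbox):
    """Read items from inbox, apply func, and pass results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))


def run_pipeline(items):
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(str.strip, q1, q2)),  # stage 1
        threading.Thread(target=stage, args=(str.upper, q2, q3)),  # stage 2
    ]
    for worker in workers:
        worker.start()
    for item in items:
        q1.put(item)
    q1.put(SENTINEL)

    output = []
    while True:
        item = q3.get()
        if item is SENTINEL:
            break
        output.append(item)
    for worker in workers:
        worker.join()
    return output
```

Because each stage has a single worker and the queues are first-in, first-out, items leave the pipeline in the order they entered.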
How does parallel computing in the cloud work?
Distributed architectures spread work across many machines and dynamically allocate resources as needed.
What is MapReduce?
A programming model designed for processing large datasets in parallel across a distributed cluster.
What are the two main phases of MapReduce?
- Map Phase
- Reduce Phase
What occurs in the Map Phase of MapReduce?
Input data is split into smaller chunks, and a map function processes the chunks in parallel, emitting intermediate key-value pairs.
What occurs in the Reduce Phase of MapReduce?
The intermediate results are grouped by key, then aggregated and combined to generate the final output.
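The two phases can be sketched as a word count in plain Python. This is a single-machine illustration under stated assumptions, not the Hadoop MapReduce API: the function names are hypothetical, a thread pool stands in for cluster nodes, and the grouping step between the phases is named `shuffle` here, following common MapReduce terminology.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped):
    """Group all emitted values by key so each key is reduced once."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    with ThreadPoolExecutor() as pool:
        # Each chunk is mapped independently, so this step can run in parallel.
        mapped = list(pool.map(map_chunk, chunks))
    groups = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in groups.items())
```

For example, `word_count(["a b a", "b c"])` counts each word across both chunks.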
What does Hadoop provide?
- HDFS (Hadoop Distributed File System)
- YARN (Yet Another Resource Negotiator)
- MapReduce API
What is HDFS?
A scalable and fault-tolerant storage system.
What is YARN?
A resource management layer that schedules and allocates resources efficiently.
What is the limitation of Hadoop?
Its batch-processing nature makes it less suitable for real-time applications.
What is Apache Spark?
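An in-memory distributed computing engine that is often much faster than Hadoop MapReduce, because it can keep intermediate results in memory instead of writing them to disk between steps.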
What is the RDD abstraction in Spark?
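Resilient Distributed Dataset: an immutable collection partitioned across the cluster that can be processed in memory, with fault tolerance provided by recomputing lost partitions from their lineage.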
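The idea of rebuilding data from a recorded chain of transformations can be illustrated with a toy class. This is a loose sketch and not Spark's API: `ToyRDD` is a hypothetical name, it runs on one machine with no partitioning, and it only shows that transformations are recorded lazily and replayed from the source when a result is requested.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD; not Spark's API."""

    def __init__(self, source, lineage=()):
        self._source = list(source)     # original input data
        self._lineage = tuple(lineage)  # recorded transformations, in order

    def map(self, func):
        # Nothing is computed yet; the step is only added to the lineage.
        return ToyRDD(self._source, self._lineage + (("map", func),))

    def filter(self, predicate):
        return ToyRDD(self._source, self._lineage + (("filter", predicate),))

    def collect(self):
        # Replay the lineage from the source. A lost result can always be
        # rebuilt this way, which is the intuition behind fault tolerance.
        data = self._source
        for kind, func in self._lineage:
            if kind == "map":
                data = [func(x) for x in data]
            else:
                data = [x for x in data if func(x)]
        return data
```

Calling `collect()` twice on the same object recomputes the same result from the source, since each `ToyRDD` is never modified after creation.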
List key features of Apache Spark.
- RDDs
- DAG (Directed Acyclic Graph) Execution
- Streaming Capabilities
- Integration with Machine Learning
- Graph Processing
What is serverless computing?
A cloud model that lets developers deploy code without managing the underlying infrastructure.
What is Function-as-a-Service (FaaS)?
A cloud execution model where individual functions are executed in response to events.
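A function in this model is typically a small handler that receives an event and returns a response. The sketch below uses the `(event, context)` signature that AWS Lambda uses for Python handlers; the event field `name`, the response shape, and the helper `invoke_for_events` (a local stand-in for the platform delivering events) are assumptions for illustration.

```python
import json


def handler(event, context=None):
    """Run once per incoming event; holds no state between invocations."""
    name = event.get("name", "world")  # hypothetical event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


def invoke_for_events(events):
    """Local stand-in for the platform: call the handler once per event."""
    return [handler(event) for event in events]
```

Because the handler keeps no state between calls, a platform is free to run many copies of it at once, which is what makes automatic scaling possible.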
What are the benefits of serverless computing?
- Automatic Scaling
- Cost Efficiency
- Reduced Operational Overhead
Name popular FaaS providers.
- AWS Lambda
- Google Cloud Functions
- Azure Functions
What are cloud-native applications?
Applications designed specifically for cloud environments.
What architectures and practices do cloud-native applications leverage?
- Microservices Architecture
- Containerization
- Orchestration
- CI/CD Pipelines