Theory Flashcards
(50 cards)
What is an algorithm?
An algorithm is a set of well-defined, step-by-step instructions or rules designed to solve a specific problem or accomplish a task. Algorithms are used across various fields, especially in computer science, where they form the basis for writing programs that can perform tasks, make decisions, and process data.
In essence, an algorithm takes an input, processes it through a series of logical steps, and produces an output. They vary in complexity, from simple tasks like sorting numbers to intricate algorithms used in artificial intelligence and machine learning.
A good algorithm is typically efficient (in terms of time and resources) and effective, providing the correct result within a reasonable amount of time.
Why study algorithms and performance?
Studying algorithms and their performance is crucial for several reasons:
1. Efficiency: Algorithms play a key role in determining how quickly and efficiently a task can be completed. By understanding algorithms, we can develop solutions that minimize time and resource consumption, which is critical in fields where processing speed and scalability matter, like data analysis, machine learning, and real-time applications.
2. Optimization: Many tasks can be performed in different ways, but some approaches are more efficient than others. Studying algorithms allows us to compare and choose optimal solutions, reducing unnecessary computations and improving performance in practical applications, especially for large datasets or complex operations.
3. Problem-Solving Skills: Studying algorithms enhances analytical and problem-solving abilities. It teaches structured thinking and how to break down complex problems into manageable steps, an essential skill for software developers, engineers, and data scientists.
4. Scalability and Cost Savings: Efficient algorithms help applications scale without requiring proportionally larger resources, which can lead to significant cost savings. For businesses, efficient software means fewer server resources, lower power consumption, and, ultimately, reduced operational costs.
5. Foundational Knowledge: Algorithms are foundational to computer science and programming. Understanding them helps in better grasping advanced topics like machine learning, cryptography, distributed systems, and even theoretical computer science concepts.
6. Innovation and Competitive Advantage: The ability to create and apply efficient algorithms can provide a competitive edge. Companies like Google, Amazon, and Facebook invest heavily in algorithm research to enhance their search engines, recommendation systems, and overall user experience, setting them apart from competitors.
In short, studying algorithms and performance equips us with the skills to build faster, more reliable, and cost-effective solutions, making it fundamental in a world that increasingly relies on technology.
What are the kinds of algorithm complexity analysis?
Algorithm complexity analysis involves evaluating the resources (time and space) that an algorithm uses as a function of the input size. There are several types of complexity analysis based on different factors, such as input conditions or the resources being measured. Below are the most common kinds:
- Time Complexity Analysis
This measures the amount of time an algorithm takes to run as a function of the input size, typically denoted as n. There are different methods for analyzing time complexity:
a. Worst-case Complexity:
• Definition: It measures the maximum time an algorithm could take, given any input of size n.
• Purpose: It provides an upper bound, ensuring the algorithm performs within acceptable limits even for the hardest possible input.
• Example: Searching for an element in an unsorted array using linear search takes O(n) time in the worst case (when the element is at the end or not present).
b. Best-case Complexity:
• Definition: It measures the minimum time an algorithm takes for the most favorable input of size n.
• Purpose: While not useful for performance guarantees, it helps understand how the algorithm behaves for easy cases.
• Example: In linear search, the best-case time complexity is O(1) if the element is found at the first position.
c. Average-case Complexity:
• Definition: It provides the expected time complexity for a “typical” input of size n, averaged over all possible inputs or using a probability distribution of inputs.
• Purpose: It gives a more realistic estimate of the algorithm’s performance under normal conditions.
• Example: For linear search, if the element is equally likely to be anywhere in the array, the average case involves about n/2 comparisons, which is O(n).
d. Amortized Complexity:
• Definition: Amortized complexity refers to the average time per operation over a sequence of operations, rather than per individual operation. This is useful when a costly operation happens infrequently and is balanced out by cheaper operations.
• Purpose: It smooths out the cost of expensive operations by averaging over many instances.
• Example: In dynamic arrays (like ArrayList), inserting an element might occasionally take O(n) when resizing occurs, but across many insertions, the amortized complexity is O(1).
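To make the dynamic-array example concrete, here is a small Python sketch (illustrative only; the function and counting scheme are made up for this card) that appends n elements into a manually doubled array and counts element copies. The total number of copies stays below 2n, so the amortized cost per append is O(1).

import random  # not strictly needed; kept minimal below

def append_with_doubling(n):
    """Append n items into an array that doubles its capacity when full,
    counting how many element copies the resizes perform."""
    capacity, size, copies = 1, 0, 0
    backing = [None]                      # backing storage of length `capacity`
    for value in range(n):
        if size == capacity:              # full: resize by doubling
            capacity *= 2
            new_backing = [None] * capacity
            new_backing[:size] = backing  # copy existing elements
            copies += size
            backing = new_backing
        backing[size] = value
        size += 1
    return copies

for n in (1_000, 10_000, 100_000):
    total = append_with_doubling(n)
    # Total copies stay below 2n, so the amortized cost per append is O(1).
    print(n, total, total / n)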
- Space Complexity Analysis
This measures the amount of memory or storage space an algorithm requires as a function of the input size.
a. Auxiliary Space:
• Definition: It measures the extra space or temporary space used by an algorithm, excluding the input data. • Example: In recursive algorithms, auxiliary space includes the space required for the function call stack.
b. Total Space Complexity:
• Definition: It includes both the space required for the input and the auxiliary space required by the algorithm. • Example: A sorting algorithm like merge sort requires O(n) additional space, whereas quicksort can be done in-place with O(log n) additional space (for recursion).
- Asymptotic Complexity
This refers to the growth of time or space complexity as the input size grows. The focus is on the order of growth rather than precise values. It helps in comparing algorithms for large inputs.
a. Big O Notation (O):
• Definition: It gives an upper bound on the time or space complexity, representing the worst-case scenario. • Example: An algorithm with time complexity O(n^2) means the runtime grows quadratically with the input size.
b. Omega Notation (Ω):
• Definition: It gives a lower bound on the time or space complexity, representing the best-case scenario. • Example: If an algorithm has a best-case time complexity of Ω(n log n), it means the algorithm will take at least n log n time for certain inputs.
c. Theta Notation (Θ):
• Definition: It gives a tight bound, meaning that the algorithm’s time or space complexity is both O(f(n)) and Ω(f(n)), i.e., it grows at exactly the same rate for large input sizes. • Example: An algorithm with time complexity Θ(n) means its running time will grow linearly with the input size, regardless of best or worst case.
- Instance Complexity
This refers to the complexity of solving a particular instance of a problem rather than the general problem itself. It considers the structure of the input and how it influences complexity.
a. Problem-Specific Complexity:
• Definition: Certain problem instances may have lower complexity than the worst-case estimate. • Example: Sorting an already sorted array using bubble sort takes linear time O(n) instead of the worst-case O(n^2).
- Parameterized Complexity
This analyzes complexity with respect to specific parameters of the input, beyond just the input size. This is useful for problems where some part of the input is small, and the complexity can be expressed in terms of that parameter.
a. Fixed-Parameter Tractability (FPT):
• Definition: A problem is fixed-parameter tractable if it can be solved in time f(k) * n^O(1), where n is the input size, k is some parameter, and f(k) is a function depending only on k. • Example: Vertex cover parameterized by the solution size k can be solved in roughly O(2^k * (n + m)) time on a graph with n vertices and m edges: exponential in the parameter k but polynomial in the total input size.
- Exponential Time Complexity
This is when the time complexity grows exponentially with the input size. This type of analysis is typical for algorithms that have exponential growth, like many brute-force algorithms for NP-hard problems.
a. Exponential Time Algorithms:
• Definition: Algorithms that require O(2^n), O(n!), or similar growth, where the runtime grows extremely fast even for small increases in input size. • Example: The traveling salesman problem using a brute-force approach has time complexity O(n!).
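To make the brute-force TSP example concrete, here is a small Python sketch (the distance matrix is made up for illustration) that enumerates every ordering of the non-start cities with itertools.permutations; the loop visits (n-1)! tours, so the runtime grows factorially with the number of cities.

from itertools import permutations

def tsp_brute_force(dist):
    """Return the length of the shortest tour over all cities.
    `dist` is an n x n matrix of pairwise distances; the tour starts and
    ends at city 0. Runtime is O(n!) because every ordering of the
    remaining n - 1 cities is examined."""
    n = len(dist)
    best = float("inf")
    for order in permutations(range(1, n)):
        tour = (0,) + order + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        best = min(best, length)
    return best

# Example with 4 cities (hypothetical distances).
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 8],
    [10, 4, 8, 0],
]
print(tsp_brute_force(dist))   # explores 3! = 6 tours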
Summary:
• Time Complexity: Measures how an algorithm’s runtime grows with input size (best, worst, average cases).
• Space Complexity: Measures how an algorithm’s memory usage grows with input size.
• Asymptotic Analysis: Provides a big-picture view of the algorithm’s behavior for large inputs using notations like Big O, Omega, and Theta.
• Amortized Complexity: Looks at the average cost of operations over a sequence of operations.
• Instance Complexity: Focuses on specific instances and how they influence complexity.
• Parameterized Complexity: Focuses on certain parameters of the input beyond its size.
By considering these types of complexity analysis, we can better understand an algorithm’s performance across different scenarios and inputs.
What are the different complexity analysis notations and their general idea?
Complexity analysis notations provide a formal way to describe how an algorithm’s time or space requirements grow as a function of input size. The most commonly used notations are Big O, Big Omega (Ω), Theta (Θ), and others. Each notation serves a specific purpose in describing the performance of an algorithm in different cases (e.g., worst-case, best-case, or tight bounds). Here’s an overview of the different notations:
- Big O Notation (O)
• Definition: Big O notation describes the upper bound on the growth rate of an algorithm. It expresses the worst-case scenario, where the algorithm takes the longest time to complete or uses the most space for a given input size.
• Purpose: It helps to understand the maximum amount of time or space an algorithm might require as the input size grows.
• Formal Definition: An algorithm is O(f(n)) if there exist constants c > 0 and n0 such that for all n ≥ n0, the running time is less than or equal to c * f(n).
Example: If an algorithm takes at most 3n^2 + 2n + 5 time, we drop lower-order terms and constants, and its time complexity is O(n^2).
• General Idea: Big O notation describes the worst-case growth rate of an algorithm, focusing on the upper limits of performance.
- Big Omega Notation (Ω)
• Definition: Big Omega describes the lower bound on the growth rate of an algorithm. It represents the best-case scenario, where the algorithm performs at its fastest or uses the least amount of resources.
• Purpose: It provides insight into the minimum amount of time or space an algorithm will require for any input size.
• Formal Definition: An algorithm is Ω(f(n)) if there exist constants c > 0 and n0 such that for all n ≥ n0, the running time is at least c * f(n).
Example: For an algorithm with a time complexity of n^2 in the best case, the lower bound would be Ω(n^2).
• General Idea: Big Omega notation describes the best-case growth rate of an algorithm, focusing on the lower limits of performance.
- Theta Notation (Θ)
• Definition: Theta notation describes a tight bound on the growth rate of an algorithm. It indicates that the algorithm’s running time or space grows at the same rate in both the best and worst cases.
• Purpose: It gives a more precise description of an algorithm’s efficiency by providing both upper and lower bounds.
• Formal Definition: An algorithm is Θ(f(n)) if it is both O(f(n)) and Ω(f(n)). There exist constants c1, c2 > 0 and n0 such that for all n ≥ n0, the running time is between c1 * f(n) and c2 * f(n).
Example: If an algorithm takes time proportional to 3n + 2 in both best and worst cases, the time complexity would be Θ(n).
• General Idea: Theta notation provides a tight bound, meaning the algorithm’s growth rate is precisely proportional to the function f(n).
- Little O Notation (o)
• Definition: Little o describes a strict upper bound on the growth rate of an algorithm. It means that the algorithm’s time or space grows strictly slower than f(n); the bound is never tight.
• Purpose: It is used to express that an algorithm’s growth is smaller than a certain function asymptotically but not necessarily bounded by that function.
• Formal Definition: An algorithm is o(f(n)) if, for any constant c > 0, there exists an n0 such that for all n ≥ n0, the running time is less than c * f(n).
Example: If an algorithm takes time n log n, it is o(n^2): it grows strictly slower than n^2. (By contrast, O(n^2) would also allow growth exactly proportional to n^2.)
• General Idea: Little o notation describes an upper bound that is never tight: the algorithm grows strictly slower than f(n).
- Little Omega Notation (ω)
• Definition: Little omega describes a strict lower bound on the growth rate of an algorithm. It indicates that the algorithm’s running time or space grows strictly faster than f(n).
• Purpose: It is used to express that an algorithm’s growth is larger than a certain function asymptotically.
• Formal Definition: An algorithm is ω(f(n)) if, for any constant c > 0, there exists an n0 such that for all n ≥ n0, the running time is greater than c * f(n).
Example: If an algorithm takes time n^2, it would be ω(n log n), meaning its growth is faster than n log n.
• General Idea: Little omega notation describes a lower bound that is never tight: the algorithm grows strictly faster than f(n).
Summary of Notations:
• O(f(n)): upper bound (worst case). Describes the maximum growth rate of the algorithm; shows the worst-case time/space complexity.
• Ω(f(n)): lower bound (best case). Describes the minimum growth rate of the algorithm; shows the best-case time/space complexity.
• Θ(f(n)): tight bound. Describes the exact growth rate of the algorithm; gives both upper and lower bounds (exact rate).
• o(f(n)): strict upper bound. The algorithm grows slower than the given function; indicates an upper limit that is not tight.
• ω(f(n)): strict lower bound. The algorithm grows faster than the given function; indicates a lower limit that is not tight.
General Idea:
• Big O provides a worst-case upper bound on growth, useful for performance guarantees.
• Big Omega gives a best-case lower bound, helping understand the minimal resources required.
• Theta describes the exact asymptotic behavior of the algorithm, which is useful when the upper and lower bounds are the same.
• Little o and little omega describe bounds that are not tight: the algorithm grows strictly slower (little o) or strictly faster (little omega) than the given function.
These notations are fundamental tools in analyzing and comparing algorithms, as they help quantify how the algorithm scales with input size.
How can we calculate the asymptotic complexity of an algorithm? What are the methods for doing this?
Calculating the asymptotic complexity of an algorithm involves determining how its running time or space requirements grow relative to the input size as it becomes very large. The most common way to express asymptotic complexity is through Big O notation, which focuses on the upper bound of an algorithm’s growth rate. Here’s an overview of the methods and steps for calculating it:
- Counting Operations and Analyzing Loops
• Single Operations: Start by identifying basic operations (like arithmetic operations or assignments) and how many times they run relative to the input size n.
• Loops: For each loop, analyze how many times it iterates. A single for loop iterating n times typically results in O(n) complexity. Nested loops, where an inner loop runs in response to an outer loop, often result in O(n^2) , O(n^3) , or higher.
• Recursive Calls: When recursion is involved, evaluate how the input size reduces with each call. Recurrence relations can often help with this (see below).
- Using Recurrence Relations
• When an algorithm is recursive, recurrence relations can model its complexity. For example, the complexity of a divide-and-conquer algorithm like merge sort can be represented by the recurrence relation:
T(n) = 2T\left(\frac{n}{2}\right) + O(n)
• Techniques like the Master Theorem provide a straightforward way to solve recurrence relations for common cases, yielding an asymptotic complexity without needing to expand the recurrence fully.
- Master Theorem
• The Master Theorem is used to solve recurrence relations of the form:
T(n) = aT\left(\frac{n}{b}\right) + f(n)
• It provides asymptotic bounds based on the values of a, b, and f(n). For example:
• If f(n) = O(n^c) where c < \log_b(a), then T(n) = \Theta(n^{\log_b(a)}).
• If f(n) = \Theta(n^{\log_b(a)}), then T(n) = \Theta(n^{\log_b(a)} \log n).
• If f(n) = \Omega(n^c) where c > \log_b(a), then T(n) = \Theta(f(n)), provided certain conditions on f(n) hold.
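The case analysis above can be wrapped in a small helper. The sketch below is ours, not a standard library routine, and it assumes the driving function is a plain polynomial n^c, so the regularity condition of Case 3 holds automatically.

import math

def master_theorem(a, b, c):
    """Classify T(n) = a*T(n/b) + Theta(n^c) for a >= 1, b > 1, c >= 0.
    Assumes the driving function is a plain polynomial n^c."""
    crit = math.log(a, b)          # critical exponent log_b(a)
    if math.isclose(c, crit):
        return f"Case 2: Theta(n^{crit:.2f} * log n)"
    if c < crit:
        return f"Case 1: Theta(n^{crit:.2f})"
    return f"Case 3: Theta(n^{c})"

print(master_theorem(2, 2, 1))   # merge sort       -> Theta(n log n)
print(master_theorem(1, 2, 0))   # binary search    -> Theta(log n), since n^0 = 1
print(master_theorem(8, 2, 2))   # naive D&C matmul -> Theta(n^3)
print(master_theorem(7, 2, 2))   # Strassen         -> Theta(n^2.81)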
- Approximating with Dominant Terms
• When an algorithm has multiple terms (e.g., T(n) = 3n^2 + 5n + 100), focus on the term that grows fastest as n \to \infty. For example, in T(n) = 3n^2 + 5n + 100, the n^2 term dominates, so the complexity is O(n^2).
• Ignore constants and lower-order terms, as they have minimal impact on growth rate.
- Using Big-O Rules and Properties
• Multiplicative constants are ignored: O(3n) = O(n).
• Addition of complexities takes the maximum growth rate: O(n^2) + O(n) = O(n^2) .
• For nested loops or recursive calls, multiply the complexities of each part.
- Analyzing Best, Worst, and Average Cases
• Consider the worst-case scenario unless specified otherwise, as this gives an upper bound on the complexity.
• In some cases, you may also analyze average and best cases, particularly for algorithms like quicksort where the performance varies significantly based on input ordering.
Example: Calculating Complexity for a Simple Algorithm
Consider the following pseudocode:
for i in range(n):          # Outer loop
    for j in range(n):      # Inner loop
        print("Hello")      # Constant-time operation
• The outer loop runs n times, and for each iteration, the inner loop also runs n times.
• Since each loop runs independently n times, the total number of operations is n \times n = n^2.
• Therefore, the complexity is O(n^2).
By following these methods, you can analyze and determine the asymptotic complexity of most algorithms.
What is the divide-and-conquer design paradigm?
The divide-and-conquer design paradigm is a strategy for solving complex problems by breaking them down into smaller, more manageable subproblems, solving each subproblem independently, and then combining the results to solve the original problem. It is a common approach in algorithm design and is particularly useful for tasks that can be efficiently split into similar parts.
The divide-and-conquer approach typically follows three main steps:
1. Divide: The problem is divided into smaller subproblems that are similar to the original but smaller in size. In many cases, the problem is split into two or more equal parts.
2. Conquer: Each subproblem is solved independently. If a subproblem is small enough, it can be solved directly (this is known as the base case of the recursion). For larger subproblems, the divide-and-conquer approach is applied recursively.
3. Combine: The solutions to the subproblems are combined to form the solution to the original problem.
Key Examples of Divide-and-Conquer Algorithms
• Merge Sort: The array is divided into two halves, each half is recursively sorted, and then the two sorted halves are merged together. Merge sort has a time complexity of O(n log n) because each level of division requires O(n) work for merging, and there are log n levels.
• Quicksort: The array is partitioned around a pivot element such that elements less than the pivot are on one side and elements greater are on the other. The process is repeated recursively on each partitioned part. Its average-case time complexity is O(n log n).
• Binary Search: The search space is halved each time by comparing the target value to the middle element of a sorted array. This process repeats until the target is found or the search space is empty. Binary search has a time complexity of O(log n).
• Strassen’s Matrix Multiplication: This algorithm splits matrices into submatrices to reduce the number of multiplications needed, using divide-and-conquer principles to perform matrix multiplication more efficiently than the conventional method.
Advantages of Divide-and-Conquer
• Efficiency: Divide-and-conquer algorithms are often more efficient, especially for large inputs, as they reduce the size of the problem at each recursive level.
• Parallelization: Subproblems are often independent and can be solved in parallel, making divide-and-conquer suitable for parallel and distributed computing.
• Recursive Nature: Many problems naturally fit a recursive solution, and divide-and-conquer provides a structured way to implement this recursion.
Disadvantages of Divide-and-Conquer
• Overhead: Recursive calls can introduce overhead, especially if there are many calls or if the problem cannot be split evenly. • Memory Usage: Divide-and-conquer algorithms often require additional memory to store subproblem solutions or manage recursive calls, which can be a drawback in memory-constrained environments.
The divide-and-conquer paradigm is a powerful approach, and it has broad applications across sorting, searching, optimization, and computational geometry, among other areas.
Exemplify the calculation for merge sort complexity using the master method
Let’s analyze the time complexity of the merge sort algorithm using the Master Theorem.
- Formulating the Recurrence Relation
The merge sort algorithm works as follows:
1. Divide: Split the array into two halves.
2. Conquer: Recursively sort each half.
3. Combine: Merge the two sorted halves back together.
If we let T(n) represent the time complexity of merge sort for an input of size n , we get the recurrence relation:
T(n) = 2T\left(\frac{n}{2}\right) + O(n)
Here’s what each term represents:
• 2: We make two recursive calls (one for each half). • T\left(\frac{n}{2}\right) : Each recursive call sorts an array of half the original size. • O(n) : The merging step, which takes linear time as it involves comparing elements in each half and combining them into a single sorted array.
Thus, the recurrence relation simplifies to:
T(n) = 2T\left(\frac{n}{2}\right) + n
- Applying the Master Theorem
The Master Theorem can be applied to recurrence relations of the form:
T(n) = aT\left(\frac{n}{b}\right) + f(n)
where:
• a is the number of recursive calls, • \frac{n}{b} is the size of each subproblem, • f(n) is the cost of dividing and combining the subproblems.
In our case:
• a = 2 (two recursive calls), • b = 2 (each subproblem is half the size of the original), • f(n) = O(n) (the cost of merging).
Now we calculate n^{\log_b(a)} :
1. \log_b(a) = \log_2(2) = 1 . 2. So n^{\log_b(a)} = n^1 = n .
- Comparing f(n) to n^{\log_b(a)}
The Master Theorem has three cases based on the growth of f(n) relative to n^{\log_b(a)} :
• Case 1: If f(n) = O(n^c) where c < \log_b(a), then T(n) = \Theta(n^{\log_b(a)}).
• Case 2: If f(n) = \Theta(n^{\log_b(a)}), then T(n) = \Theta(n^{\log_b(a)} \log n).
• Case 3: If f(n) = \Omega(n^c) where c > \log_b(a), then T(n) = \Theta(f(n)), provided a regularity condition holds.
In this case, f(n) = n and n^{\log_b(a)} = n , so f(n) = \Theta(n^{\log_b(a)}) . This matches Case 2 of the Master Theorem.
- Conclusion
Since we are in Case 2, the solution to the recurrence relation is:
T(n) = \Theta(n \log n)
Therefore, the time complexity of merge sort is \Theta(n \log n) .
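As a minimal illustration (not a specific textbook implementation), the Python sketch below mirrors the recurrence: two recursive calls on halves plus a linear-time merge.

def merge_sort(a):
    """Sort a list. T(n) = 2T(n/2) + O(n) = Theta(n log n)."""
    if len(a) <= 1:                  # base case
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])       # T(n/2)
    right = merge_sort(a[mid:])      # T(n/2)
    merged = []                      # merge step: O(n)
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1, 7, 3]))   # [1, 2, 3, 5, 7, 9]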
Exemplify the calculation for binary search complexity using the master method
To analyze the time complexity of the binary search algorithm using the Master Theorem, let’s break down its recurrence relation and apply the theorem.
- Understanding Binary Search
Binary search works by repeatedly dividing a sorted array in half to locate a target value:
1. Divide: The array is split in half.
2. Conquer: Depending on whether the target value is less than or greater than the middle element, binary search proceeds with either the left half or the right half of the array.
3. Combine: There is no need to combine results in binary search since only one half is considered at each step.
Binary search stops when it finds the target value or the subarray size becomes zero (indicating the value is not in the array).
- Formulating the Recurrence Relation
Let T(n) represent the time complexity of binary search for an input of size n . The recurrence relation for binary search is:
T(n) = T\left(\frac{n}{2}\right) + O(1)
where:
• T\left(\frac{n}{2}\right) : Only one recursive call is made on half of the array (either left or right half). • O(1) : The constant time needed to compare the middle element with the target.
Thus, the recurrence relation simplifies to:
T(n) = T\left(\frac{n}{2}\right) + 1
- Applying the Master Theorem
The Master Theorem applies to recurrence relations of the form:
T(n) = aT\left(\frac{n}{b}\right) + f(n)
where:
• a is the number of recursive calls, • \frac{n}{b} is the size of each subproblem, • f(n) is the cost of dividing and combining the subproblems.
For binary search:
• a = 1 (one recursive call), • b = 2 (each recursive call operates on half the array), • f(n) = O(1) (a constant-time comparison).
Now we calculate n^{\log_b(a)} :
1. \log_b(a) = \log_2(1) = 0 . 2. So n^{\log_b(a)} = n^0 = 1 .
- Comparing f(n) to n^{\log_b(a)}
The Master Theorem has three cases based on the growth of f(n) relative to n^{\log_b(a)} :
• Case 1: If f(n) = O(n^c) where c < \log_b(a), then T(n) = \Theta(n^{\log_b(a)}).
• Case 2: If f(n) = \Theta(n^{\log_b(a)}), then T(n) = \Theta(n^{\log_b(a)} \log n).
• Case 3: If f(n) = \Omega(n^c) where c > \log_b(a), then T(n) = \Theta(f(n)), provided a regularity condition holds.
In this case, f(n) = 1 and n^{\log_b(a)} = 1 , so f(n) = \Theta(n^{\log_b(a)}) . This matches Case 2 of the Master Theorem.
- Conclusion
Since we are in Case 2, the solution to the recurrence relation is:
T(n) = \Theta(\log n)
Therefore, the time complexity of binary search is \Theta(\log n) .
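A minimal recursive sketch in Python (illustrative only) that matches the recurrence: one recursive call on half of the range plus a constant-time comparison.

def binary_search(a, target, lo=0, hi=None):
    """Recursive binary search over the sorted list a.
    One recursive call on half of the range: T(n) = T(n/2) + O(1) = Theta(log n)."""
    if hi is None:
        hi = len(a) - 1
    if lo > hi:                        # empty subarray: not found
        return -1
    mid = (lo + hi) // 2
    if a[mid] == target:
        return mid
    if a[mid] < target:
        return binary_search(a, target, mid + 1, hi)   # search the right half
    return binary_search(a, target, lo, mid - 1)       # search the left half

print(binary_search([1, 3, 5, 7, 9, 11], 7))    # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))    # -1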
Exemplify the calculation for power of a number complexity using the master method
To analyze the time complexity of calculating the power of a number (e.g., x^n ) using an efficient recursive algorithm, let’s consider the algorithm based on exponentiation by squaring. This algorithm reduces the exponent by half with each recursive call, which allows for efficient computation.
Recursive Power Calculation Algorithm (Exponentiation by Squaring)
The recursive algorithm for calculating x^n can be defined as follows:
1. Base Case: If n = 0, return 1 (since x^0 = 1).
2. Divide:
• If n is even, calculate x^{n/2} recursively and square the result.
• If n is odd, calculate x^{(n-1)/2} recursively, square the result, and multiply by x.
This leads to the following recurrence relation for the time complexity T(n) :
T(n) = T\left(\frac{n}{2}\right) + O(1)
- Formulating the Recurrence Relation
In this recurrence:
• T\left(\frac{n}{2}\right) : We make one recursive call on half the exponent n , which reduces the exponent size by half each time. • O(1) : The constant time needed for the multiplication step (either squaring the result or multiplying by x if n is odd).
Thus, the recurrence relation simplifies to:
T(n) = T\left(\frac{n}{2}\right) + 1
- Applying the Master Theorem
The Master Theorem applies to recurrence relations of the form:
T(n) = aT\left(\frac{n}{b}\right) + f(n)
where:
• a is the number of recursive calls, • \frac{n}{b} is the size of each subproblem, • f(n) is the cost of dividing and combining the subproblems.
For this power calculation:
• a = 1 (one recursive call), • b = 2 (each recursive call operates on half of the exponent), • f(n) = O(1) (a constant-time multiplication operation).
Now we calculate n^{\log_b(a)} :
1. \log_b(a) = \log_2(1) = 0 . 2. So n^{\log_b(a)} = n^0 = 1 .
- Comparing f(n) to n^{\log_b(a)}
The Master Theorem has three cases based on the growth of f(n) relative to n^{\log_b(a)} :
• Case 1: If f(n) = O(n^c) where c < \log_b(a), then T(n) = \Theta(n^{\log_b(a)}).
• Case 2: If f(n) = \Theta(n^{\log_b(a)}), then T(n) = \Theta(n^{\log_b(a)} \log n).
• Case 3: If f(n) = \Omega(n^c) where c > \log_b(a), then T(n) = \Theta(f(n)), provided a regularity condition holds.
In this case, f(n) = 1 and n^{\log_b(a)} = 1 , so f(n) = \Theta(n^{\log_b(a)}) . This matches Case 2 of the Master Theorem.
- Conclusion
Since we are in Case 2, the solution to the recurrence relation is:
T(n) = \Theta(\log n)
Therefore, the time complexity of calculating the power of a number using exponentiation by squaring is \Theta(\log n) .
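A minimal Python sketch of exponentiation by squaring (assuming a non-negative integer exponent), mirroring the recurrence above:

def power(x, n):
    """Compute x**n for an integer n >= 0 by exponentiation by squaring.
    One recursive call on n // 2: T(n) = T(n/2) + O(1) = Theta(log n) multiplications."""
    if n == 0:                      # base case: x^0 = 1
        return 1
    half = power(x, n // 2)         # single recursive call on half the exponent
    if n % 2 == 0:
        return half * half          # even exponent: square the half power
    return half * half * x          # odd exponent: square, then multiply by x once more

print(power(2, 10))   # 1024
print(power(3, 5))    # 243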
Exemplify the calculation for matrix multiplication complexity using the master method
If we calculate matrix multiplication using the divide-and-conquer approach but without using Strassen’s technique, we stick to the standard method of matrix multiplication. In this approach, we perform eight recursive multiplications on submatrices rather than the optimized seven in Strassen’s algorithm.
Standard Divide-and-Conquer Approach for Matrix Multiplication
Given two n \times n matrices A and B , we can divide each matrix into four \frac{n}{2} \times \frac{n}{2} submatrices. For matrices A and B :
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}
To compute the product C = A \times B , where:
C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
Each block C_{ij} can be computed as follows:
C_{11} = A_{11}B_{11} + A_{12}B_{21}
C_{12} = A_{11}B_{12} + A_{12}B_{22}
C_{21} = A_{21}B_{11} + A_{22}B_{21}
C_{22} = A_{21}B_{12} + A_{22}B_{22}
This requires 8 recursive multiplications of size \frac{n}{2} \times \frac{n}{2} and some additional O(n^2) work to add the results.
- Formulating the Recurrence Relation
Let T(n) represent the time complexity for multiplying two n \times n matrices using this approach. The recurrence relation is:
T(n) = 8T\left(\frac{n}{2}\right) + O(n^2)
where:
• 8T\left(\frac{n}{2}\right) : We make 8 recursive calls, each on a \frac{n}{2} \times \frac{n}{2} matrix. • O(n^2) : The cost of combining (adding) the submatrices to produce the final result.
- Applying the Master Theorem
The Master Theorem applies to recurrence relations of the form:
T(n) = aT\left(\frac{n}{b}\right) + f(n)
where:
• a is the number of recursive calls, • \frac{n}{b} is the size of each subproblem, • f(n) is the cost of dividing and combining the subproblems.
For standard divide-and-conquer matrix multiplication:
• a = 8 (eight recursive calls), • b = 2 (each subproblem is half the size), • f(n) = O(n^2) (the cost of combining results).
Now we calculate n^{\log_b(a)} :
1. \log_b(a) = \log_2(8) = 3 . 2. So n^{\log_b(a)} = n^3 .
- Comparing f(n) to n^{\log_b(a)}
The Master Theorem has three cases based on the growth of f(n) relative to n^{\log_b(a)} :
• Case 1: If f(n) = O(n^c) where c < \log_b(a), then T(n) = \Theta(n^{\log_b(a)}).
• Case 2: If f(n) = \Theta(n^{\log_b(a)}), then T(n) = \Theta(n^{\log_b(a)} \log n).
• Case 3: If f(n) = \Omega(n^c) where c > \log_b(a), then T(n) = \Theta(f(n)), provided a regularity condition holds.
In this case:
• f(n) = O(n^2) and n^{\log_b(a)} = n^3 . • Since n^2 grows slower than n^3 (i.e., c = 2 < 3 ), this matches Case 1.
- Conclusion
Since we are in Case 1, the solution to the recurrence relation is:
T(n) = \Theta(n^{\log_b(a)}) = \Theta(n^3)
Therefore, the time complexity of matrix multiplication using the standard divide-and-conquer approach (without Strassen’s technique) is \Theta(n^3) . This matches the complexity of the traditional, non-recursive approach to matrix multiplication.
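A minimal Python sketch of the standard divide-and-conquer scheme (assuming n is a power of two and using plain lists; the helper names are ours):

def split(M):
    """Split an n x n matrix (n even) into four n/2 x n/2 blocks."""
    n = len(M) // 2
    return ([row[:n] for row in M[:n]], [row[n:] for row in M[:n]],
            [row[:n] for row in M[n:]], [row[n:] for row in M[n:]])

def add(X, Y):
    """Elementwise matrix addition."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def dc_matmul(A, B):
    """Divide-and-conquer matrix multiplication (n a power of two).
    Eight recursive products of half size plus O(n^2) additions:
    T(n) = 8 T(n/2) + O(n^2) = Theta(n^3)."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    C11 = add(dc_matmul(A11, B11), dc_matmul(A12, B21))
    C12 = add(dc_matmul(A11, B12), dc_matmul(A12, B22))
    C21 = add(dc_matmul(A21, B11), dc_matmul(A22, B21))
    C22 = add(dc_matmul(A21, B12), dc_matmul(A22, B22))
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]       # stitch blocks back together
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(dc_matmul(A, B))   # [[19, 22], [43, 50]]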
Detail the reason why a conventional divide-and-conquer algorithm does not help for matrix multiplication, then discuss Strassen’s idea to solve it
Matrix multiplication complexity can vary depending on the algorithm used. A conventional divide-and-conquer approach splits each matrix into four submatrices but still needs eight half-size multiplications, so the Master Theorem gives T(n) = 8T(n/2) + O(n^2) = \Theta(n^3), which is no better than the classical triple-loop method. Strassen’s insight is to trade extra additions for one fewer multiplication, computing the product with only seven recursive multiplications, which lowers the exponent. Here, let’s analyze Strassen’s algorithm for matrix multiplication, which is more efficient than the standard approach for large matrices.
Strassen’s Algorithm for Matrix Multiplication
Strassen’s algorithm reduces the number of multiplications needed to multiply two matrices. For two n \times n matrices A and B , it divides them into four \frac{n}{2} \times \frac{n}{2} submatrices and performs the multiplication using seven recursive multiplications (rather than eight as in the standard method). This leads to the following recurrence relation:
T(n) = 7T\left(\frac{n}{2}\right) + O(n^2)
Here:
• 7T\left(\frac{n}{2}\right) : There are 7 recursive multiplications on submatrices of size \frac{n}{2} \times \frac{n}{2} . • O(n^2) : The cost to combine the submatrices after multiplication.
- Applying the Master Theorem
The Master Theorem applies to recurrence relations of the form:
T(n) = aT\left(\frac{n}{b}\right) + f(n)
where:
• a is the number of recursive calls, • \frac{n}{b} is the size of each subproblem, • f(n) is the cost of dividing and combining the subproblems.
For Strassen’s algorithm:
• a = 7 (seven recursive multiplications), • b = 2 (each subproblem is half the size of the original), • f(n) = O(n^2) (the cost of combining results).
Now we calculate n^{\log_b(a)} :
1. \log_b(a) = \log_2(7) \approx 2.81 . 2. So n^{\log_b(a)} \approx n^{2.81} .
- Comparing f(n) to n^{\log_b(a)}
The Master Theorem has three cases based on the growth of f(n) relative to n^{\log_b(a)} :
• Case 1: If f(n) = O(n^c) where c < \log_b(a), then T(n) = \Theta(n^{\log_b(a)}).
• Case 2: If f(n) = \Theta(n^{\log_b(a)}), then T(n) = \Theta(n^{\log_b(a)} \log n).
• Case 3: If f(n) = \Omega(n^c) where c > \log_b(a), then T(n) = \Theta(f(n)), provided a regularity condition holds.
In this case:
• f(n) = O(n^2) and n^{\log_b(a)} \approx n^{2.81} . • Since n^2 grows slower than n^{2.81} (i.e., c = 2 < 2.81 ), this matches Case 1.
- Conclusion
Since we are in Case 1 of the Master Theorem, the solution to the recurrence relation is:
T(n) = \Theta(n^{\log_b(a)}) = \Theta(n^{2.81})
Therefore, the time complexity of Strassen’s algorithm for matrix multiplication is \Theta(n^{2.81}) , which is more efficient than the standard matrix multiplication algorithm with \Theta(n^3) complexity for large matrices.
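A compact sketch of Strassen’s scheme using NumPy (assuming square matrices whose size is a power of two; the cutoff for falling back to ordinary multiplication is an arbitrary choice). The seven products M1..M7 replace the eight block products of the conventional scheme.

import numpy as np

def strassen(A, B, cutoff=64):
    """Strassen multiplication of square matrices with power-of-two size.
    Seven half-size products: T(n) = 7 T(n/2) + O(n^2) = Theta(n^log2(7)).
    Below `cutoff`, fall back to ordinary multiplication."""
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    k = n // 2
    A11, A12, A21, A22 = A[:k, :k], A[:k, k:], A[k:, :k], A[k:, k:]
    B11, B12, B21, B22 = B[:k, :k], B[:k, k:], B[k:, :k], B[k:, k:]
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

A = np.random.rand(128, 128)
B = np.random.rand(128, 128)
print(np.allclose(strassen(A, B), A @ B))   # True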
What is a machine model? Why do we need them?
A machine model describes an abstract “machine” and assigns a cost to the operations that machine can perform.
It helps us:
- Reason about algorithms more easily
- Derive complexity bounds
- Analyze the maximum available parallelism
What are Random Access Machines? How do we calculate time and space complexity in them?
Some characteristics of RAMs are:
- An unbounded number of local memory cells
- Each memory cell can hold an integer of unbounded size
- The instruction set includes simple operations, data operations, comparisons, and branches
- All operations take unit time
We compute time and space complexity as follows:
- Time complexity = number of instructions executed
- Space complexity = number of memory cells used
What are Parallel Random Access Machines? How do we calculate time and space complexity of a program running in this system?
A PRAM is an abstract machine for designing algorithms applicable to parallel computers.
A PRAM M’ is a system <M, X, Y, A> of infinitely many:
- RAMs M1, M2, …, where each Mi is called a processor of M’. All the processors are assumed to be identical, and each has the ability to recognize its own index i.
- Input cells X(1), X(2), …
- Output cells Y(1), Y(2), …
- Shared memory cells A(1), A(2), …
Other important characteristics of PRAMs:
- An unbounded collection of shared memory cells
- All processors can access all memory cells in unit time
- All communication happens via shared memory
What are the steps in a PRAM computation?
A PRAM computation consists of 5 phases, carried out in parallel by all the processors. Each processor:
◦ Reads a value from one of the input cells X(1), …, X(N)
◦ Reads one of the shared memory cells A(1), A(2), …
◦ Performs some internal computation
◦ May write into one of the output cells Y(1), Y(2), …
◦ May write into one of the shared memory cells A(1), A(2), …
What is a write conflict? How can we classify PRAMs according to read/write operations?
A write conflict occurs when two or more processors try to write simultaneously into the same cell.
PRAMs are classified based on their read/write abilities (realistic and useful):
◦ Exclusive Read (ER): all processors can simultaneously read from distinct memory locations
◦ Exclusive Write (EW): all processors can simultaneously write to distinct memory locations
◦ Concurrent Read (CR): all processors can simultaneously read from any memory location
◦ Concurrent Write (CW): all processors can write to any memory location
◦ Combining these gives the usual models: EREW, CREW, CRCW
The concurrent-write PRAMs can be further divided into:
• Priority CW: processors are assigned priorities, and these decide which value is written; the processor with the highest priority is allowed to complete the write
• Common CW: all processors are allowed to complete the write iff all the values to be written are equal
• Arbitrary/Random CW: one randomly chosen processor is allowed to complete the write
What are some definitions we are going to use?
T*(n) is the time to solve a problem of input size n on one processor, using the best sequential algorithm.
Tp(n) is the time to solve the problem on p processors.
SUp(n) is the speedup obtained with p processors: SUp(n) = T*(n) / Tp(n), i.e., the time to solve a problem of size n on one processor with the best sequential algorithm divided by the time to solve the same problem in parallel on p processors.
In PRAM (Parallel Random Access Machine) models, efficiency is a measure of how well parallel resources are used relative to a single processor. It’s defined as the ratio of the work done by one processor to the work done when multiple processors are used. Specifically, we look at:
\text{Efficiency} = \frac{T_1}{p \times T_p}
where:
• T_1 is the time taken to complete the task with a single processor, • T_p is the time taken with p processors, • p is the number of processors used.
Explanation:
Efficiency quantifies the productivity of each processor in the parallel system:
• If efficiency is close to 1, the processors are being used effectively, with minimal idle time or overhead. • If efficiency is much less than 1, it suggests that the additional processors aren’t being fully utilized, often due to communication overhead, synchronization delays, or other bottlenecks.
In the PRAM model, achieving high efficiency involves minimizing communication costs and balancing workloads among processors.
In PRAM models, work is a measure of the total computational effort, representing the sum of operations performed by all processors. It’s calculated as:
\text{Work} = p \times T_p
where:
• p is the number of processors, • T_p is the time taken to complete the task with p processors.
Explanation:
• Work quantifies the total amount of computational steps performed, regardless of how they are distributed among processors. • Ideally, the work with multiple processors should approach T_1 , the time required by a single processor, as closely as possible. When the work with p processors (i.e., p \times T_p ) is close to T_1 , the parallel algorithm is said to be work-efficient.
In PRAM analysis, we aim to minimize both work and time while maintaining high efficiency.
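Putting the definitions together, a tiny sketch with made-up timing numbers shows how speedup, efficiency, and work relate:

def parallel_metrics(t1, tp, p):
    """Speedup, efficiency, and work for a run on p processors.
    t1: single-processor time, tp: time on p processors."""
    speedup = t1 / tp
    work = p * tp
    efficiency = t1 / work          # equals speedup / p
    return speedup, efficiency, work

# Hypothetical numbers: 64 s sequentially, 2.5 s on 32 processors.
s, e, w = parallel_metrics(64.0, 2.5, 32)
print(f"speedup={s:.1f}, efficiency={e:.2f}, work={w:.1f}")
# speedup=25.6, efficiency=0.80, work=80.0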
Why is T* different from T1? What could it indicate?
Sequential and parallel programs are different: T1 is the time of the parallel program run on a single processor, whereas T* uses the best sequential algorithm. The gap between them indicates the overhead the parallel program adds.
Analyze a matrix-vector multiplication. How do we partition the matrix? Take the example of 32 processors computing a (256x256) matrix times a 256-element vector.
The 256x256 matrix can be divided into 32 blocks of 8x256. Each processor reads one block and the 256-element vector and computes its own part of the final result.
• Step 1: concurrent read of x(1:n). Best supported by a broadcast; n elements are transferred.
• Step 2: simultaneous reads of different sections of A. n^2/p elements are transferred to each processor.
• Step 3: compute. Each processor performs n^2/p operations.
• Step 4: simultaneous writes. n/p elements are transferred from each processor; there are no write conflicts.
What is the general view of the matrix-vector multiplication (MVM) algorithm on a PRAM? What is its performance?
On one processor O(n^2)
On p processors O(n^2/p)
Cost = O(p * n^2/p) = O(n^2)
Efficiency = T1/(pTp) = n^2/(pn^2/p) = 1
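A simulated sketch of the row-block partitioning (the processor loop runs sequentially here, so it only illustrates the data decomposition, not real parallel execution; it assumes p divides n):

def parallel_mvm(A, x, p):
    """Matrix-vector multiply with the matrix split into p row blocks.
    Each 'processor' handles n/p rows, i.e. n^2/p operations, so with p
    processors the parallel time is O(n^2/p)."""
    n = len(A)
    rows_per_proc = n // p              # assume p divides n for simplicity
    y = [0] * n
    for proc in range(p):               # on a PRAM these blocks run in parallel
        start = proc * rows_per_proc
        for i in range(start, start + rows_per_proc):
            y[i] = sum(A[i][k] * x[k] for k in range(n))
    return y

# Tiny example: 4x4 matrix, 2 "processors".
A = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]
x = [1, 1, 1, 1]
print(parallel_mvm(A, x, 2))   # [1, 2, 3, 4]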
What is the general view of the SPMD SUM algorithm? What is its performance?
Let’s look at the SPMD (Single Program Multiple Data) model in the context of computing the sum of the elements of an array A of size N on a PRAM (Parallel Random Access Machine).
SPMD Model Overview
In the SPMD model, multiple processors execute the same program but on different portions of the data. Each processor works on its designated section, which helps parallelize the task efficiently.
PRAM Model
The PRAM model is a theoretical framework for designing parallel algorithms. It assumes an idealized scenario where:
• Multiple processors operate in parallel.
• They have shared memory access.
• Memory access is instantaneous (no delays), which allows all processors to read and write concurrently.
PRAM models can vary by how they handle concurrent read and write operations. Common types include:
• EREW (Exclusive Read Exclusive Write) – no two processors can access the same memory cell at the same time.
• CREW (Concurrent Read Exclusive Write) – multiple processors can read the same cell but cannot write to the same cell simultaneously.
• CRCW (Concurrent Read Concurrent Write) – both reads and writes can occur simultaneously, though specific rules determine conflict resolution.
Parallel Sum on PRAM
To sum an array A(1:N) using PRAM, we can exploit the SPMD model to split the work across multiple processors:
1. Initialization: Suppose we have p processors. We divide the array A into p chunks, with each processor responsible for a portion of A .
2. Local Sum Computation: Each processor calculates the sum of its assigned portion independently. Let’s say processor i is assigned the sub-array A(i \cdot N/p : (i+1) \cdot N/p - 1) . Each processor calculates a local sum of its sub-array.
3. Global Sum Reduction:
• Once all processors have their local sums, they participate in a parallel reduction operation to combine these results into a single global sum.
• This reduction can be done in logarithmic time by halving the number of active processors in each step. For example:
• In the first step, processor 1 adds its sum to processor 0, processor 3 adds its sum to processor 2, and so on.
• In the next step, processor 2 adds its accumulated sum to processor 0, and so forth, until one processor holds the final sum.
Efficiency and Complexity
On an ideal PRAM:
• The local sums calculation has a time complexity of O(N/p) since each processor independently processes its portion of A .
• The reduction phase has a time complexity of O(\log p) , as the number of active processors is halved in each step.
The overall time complexity, therefore, becomes O(N/p + \log p) .
Summary
This SPMD SUM on a PRAM model allows us to:
• Divide the array into parts, each processed by a different processor.
• Use a reduction to combine partial sums efficiently.
This approach is efficient on large arrays and a high number of processors, as it balances the workload across processors and minimizes the final summation time.
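A simulated sketch of the two phases (local sums plus a logarithmic tree reduction); the "processors" are plain loop iterations, so this shows the data flow rather than real parallelism.

import math

def spmd_sum(A, p):
    """Sum array A using p simulated processors.
    Phase 1: each processor sums its chunk of N/p elements  -> O(N/p).
    Phase 2: pairwise tree reduction over the p partial sums -> O(log p).
    Total parallel time: O(N/p + log p)."""
    N = len(A)
    chunk = math.ceil(N / p)
    partial = [sum(A[i * chunk:(i + 1) * chunk]) for i in range(p)]   # phase 1
    step = 1
    while step < p:                        # phase 2: log2(p) reduction rounds
        for i in range(0, p, 2 * step):    # these additions happen in parallel
            if i + step < p:
                partial[i] += partial[i + step]
        step *= 2
    return partial[0]

A = list(range(1, 101))         # 1 + 2 + ... + 100
print(spmd_sum(A, 8))           # 5050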
Describe the MM algorithm.
Matrix multiplication on the PRAM model can be efficiently parallelized by taking advantage of multiple processors working in parallel on different parts of the matrices. Let’s go through the general approach to perform matrix multiplication on a PRAM.
Problem Setup
Suppose we have two matrices:
• Matrix A of size n \times n
• Matrix B of size n \times n
The goal is to compute their product C = A \times B , where C will also be an n \times n matrix, and each element C[i][j] is calculated as:
C[i][j] = \sum_{k=1}^n A[i][k] \cdot B[k][j]
PRAM Parallel Matrix Multiplication Approach
On a PRAM, we can use multiple processors to calculate each element of the resulting matrix C in parallel. The PRAM model’s shared memory allows processors to access elements in A and B concurrently, which makes parallelization straightforward.
Key Steps in Parallel Matrix Multiplication
1. Assign Processors to Elements in C: We assume that we have n^2 processors available, one for each element in C. So, processor P_{i,j} will be responsible for computing C[i][j].
2. Parallel Computation of Each Element C[i][j]: Each processor P_{i,j} computes C[i][j] by iterating through the elements in row i of A and column j of B. For each pair A[i][k] and B[k][j], the processor multiplies the values, accumulates the result, and stores it in C[i][j].
3. Implementation in Steps: Each processor P_{i,j} does the following:
• Initialize C[i][j] = 0.
• FOR k = 1 to n: GLOBAL READ A[i][k] and B[k][j]; compute the partial product \text{partial\_sum} = A[i][k] \cdot B[k][j]; add it to the accumulator, C[i][j] = C[i][j] + \text{partial\_sum}.
• Once the loop is completed, C[i][j] holds the correct value.
4. Writing the Result: After completing the computation for each C[i][j], the result is stored in the shared memory location representing matrix C.
Complexity Analysis
For an n \times n matrix multiplication on an ideal PRAM with n^2 processors:
• Time Complexity: The multiplication of each pair A[i][k] and B[k][j] and the subsequent addition take constant time O(1) .
• The summation over n elements for each C[i][j] can be completed in O(n) .
• Thus, the overall time complexity is O(n) for matrix multiplication on a PRAM model with n^2 processors.
Variations for Fewer Processors
If we have fewer than n^2 processors (say p < n^2 ), we can still parallelize the operation by dividing the work across processors, but this would increase the total runtime. Each processor would handle multiple elements of C , leading to a trade-off between processor count and execution time.
Summary
Matrix multiplication on a PRAM can be efficiently parallelized by:
1. Assigning each processor to compute one element in the result matrix C .
2. Using parallel access to read values from A and B .
3. Performing concurrent partial multiplications and summations to obtain each C[i][j] .
This approach leverages the PRAM model’s shared memory to achieve high parallelism and reduce computation time, making matrix multiplication feasible for large matrices in parallel computing environments.
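A simulated sketch of the per-element scheme, with one notional "processor" per entry of C (on a real PRAM the i, j loops would run concurrently, so each element finishes in O(n) time):

def pram_matmul(A, B):
    """One simulated processor per output element C[i][j].
    On a PRAM with n^2 processors all entries are computed concurrently,
    each in O(n) time, giving O(n) total parallel time."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):                # each (i, j) pair plays the role of
        for j in range(n):            # an independent processor P_{i,j}
            acc = 0
            for k in range(n):        # the O(n) summation done by P_{i,j}
                acc += A[i][k] * B[k][j]
            C[i][j] = acc
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(pram_matmul(A, B))   # [[19, 22], [43, 50]]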
How can we define performance?
Reducing the total time it takes to compute a single result (latency).
Increasing the rate at which a series of results can be computed (throughput).
Reducing the power consumption of a computation.
What does Amdahl’s Law say?
It says that the execution time of a program falls into two categories:
- time spent doing non-parallelizable work (Wser)
- time spent doing parallelizable work (Wpar)
Following this logic we can calculate the speed up we can achieve by increasing the number of processors:
Sp <= (Wser + Wpar) / (Wser + Wpar/p), where p is the number of processors.
And if we take f to be the non-parallelizable fraction of the program, we get:
Sp <= 1 / (f + (1-f)/p)
This leads us to conclude that the speedup is limited by the non-parallelizable work: even with an infinite number of processors, the speedup cannot exceed 1/f.
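A quick numeric sketch of the bound for a few hypothetical serial fractions f:

def amdahl_speedup(f, p):
    """Upper bound on speedup with serial fraction f and p processors."""
    return 1.0 / (f + (1.0 - f) / p)

for f in (0.05, 0.10, 0.25):
    # Even with very many processors the speedup saturates near 1/f.
    print(f"f={f:.2f}:",
          [round(amdahl_speedup(f, p), 1) for p in (2, 8, 32, 1024)])
# f=0.05: [1.9, 5.9, 12.5, 19.6]
# f=0.10: [1.8, 4.7, 7.8, 9.9]
# f=0.25: [1.6, 2.9, 3.7, 4.0]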