Final Flashcards

(54 cards)

1
Q

What is a sparse vs dense graph?

A

Dense: |E| ≈ |V|²
Sparse: |E| ≪ |V|²

*In general, |E| = O(|V|²)

2
Q

What are 2 ways of representation of a graph?

A

Adjacency List:
- Array “Adjacency” of |V| lists → the slot of vertex a holds a linked list of all the vertices a connects to (its edges)
- 1 list/vertex
- Can store weights in the adjacency lists
- Good for sparse graphs
- Good when frequently add/remove vertices or need to traverse the graph
- Not good for determining whether an edge (u, v) exists

Adjacency Matrix:
- |V| x |V| matrix → 1 if an edge exists, 0 if not (can also store the weight of the edge directly)
- For undirected graphs, A = A^T
- Good for adding and removing edges, NOT vertices
- Good to check presence/absence of an edge
- NOT sparse-efficient

Space: adjacency list → O(V + E); adjacency matrix → O(V²)
*Traversing the whole graph (e.g. BFS/DFS) runs in O(V + E) with an adjacency list
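A minimal Python sketch of both representations (the tiny 4-vertex example graph is made up for illustration, not taken from the notes):

```python
# Small directed example graph with vertices 0..3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency list: one list per vertex, O(V + E) space.
adj_list = [[] for _ in range(n)]
for u, v in edges:
    adj_list[u].append(v)

# Adjacency matrix: |V| x |V| grid, O(V^2) space, O(1) edge lookup.
adj_matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[u][v] = 1

print(adj_list)          # [[1, 2], [2], [3], []]
print(adj_matrix[0][2])  # 1 -> the edge (0, 2) exists
```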

3
Q

What are the main features of the BFS algorithm for graph traversal?
(input, output, data structures)

A

Input: graph G = (V, E) (directed or not) and a source node

Output:
d[v] = distance from s to each vertex v (inf. if v is not reachable)
pi[v] = u such that (u, v) is the last edge on a shortest path s → v

Algorithm:
- Uses a FIFO queue to determine the next vertex to visit (expands radially; nodes come off the queue in order of distance from the source)
- A node is gray when put into the queue
- A node is black when popped off the queue and visited
- A node is white if not visited yet (d[v] still infinite)
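A minimal BFS sketch in Python matching the d[v]/pi[v] output above (the dict-of-lists adj format and INF for unreachable vertices are assumptions of this sketch):

```python
from collections import deque

def bfs(adj, s):
    """Single-source BFS. adj: dict vertex -> list of neighbours."""
    INF = float("inf")
    d = {v: INF for v in adj}     # white: not discovered yet
    pi = {v: None for v in adj}
    d[s] = 0
    q = deque([s])                # a vertex is "gray" while it sits on the queue
    while q:
        u = q.popleft()           # "black" once dequeued and fully explored
        for v in adj[u]:
            if d[v] == INF:       # first time we see v
                d[v] = d[u] + 1
                pi[v] = u
                q.append(v)
    return d, pi
```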

4
Q

What are the main features of the DFS algorithm for graph traversal?
(input, output, data structures)

A

Input: graph G = (V, E) (directed or not), NO source vertex

Output:
- d[v] = discovery time (v turns from white to gray)
- f[v] = finishing time (v turns from gray to black)
- pi[v] = predecessor of v = u, such that v was discovered during the scan of u’s adjacency list

Algorithm:
- Uses a stack

*Often subroutine of another algorithm

5
Q

In DFS, what are Tree, Back, Forward and Cross edges?

A

Tree edge → in the depth-first forest. Found by exploring (u, v), discovers a new/white node
Back edge → (u, v), where u is a descendant of v (in the depth-first tree).
Forward edge → (u, v), where v is a descendant of u, but not a tree edge. (finds closed node)
Cross edge → any other edge. Can go between vertices in the same depth-first tree or in different depth-first trees.

*Based on the color when being discovered

6
Q

What type of graph traversal is used in the 3 following cases?
1. Topological Sort
2. Shortest Path
3. Minimum Spanning Tree

A
  1. Topological Sort → Depth-First (Stack)
  2. Shortest Path → Breadth-First (Queue)
  3. Minimum Spanning Tree → Best-First (Priority Queue)
7
Q

What is a directed acyclic graph (DAG)?
What lemma characterizes a DAG?

A
  • A DAG can encode precedence relations or dependencies in a natural way
  • A source is a vertex with no incoming edges
  • A sink is a vertex with no outgoing edges
    *A DAG may have more than one of each

Lemma 1 → A directed graph G is acyclic iff a DFS of G yields no back edges (proof shows that a back edge leads to a cycle + a cycle implies a back edge, prove both ways)

8
Q

What is a BFS-based approach to the Topological Sort algorithm?
What is its running time?
*Kahn’s algorithm

A

*Made on DAG → sorts elements in a natural way
*Version of BFS

  1. Compute all in-degrees and add every vertex v with inDegree = 0 (a source) to the queue → sources come first in the ordering
  2. Delete v from G → safe since removing a source can’t create any cycles
    - Decrease the in-degree of all its neighbouring nodes by 1; if a node’s inDegree reaches 0, add it to the queue
  3. Recursively compute the topological ordering of G - {v}
    - i.e. repeat step 2 until the queue is empty

Running time:
Computing in-degree → O(V+E)
Rest of the algorithm → O(V+E)
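A minimal sketch of Kahn's algorithm in Python, assuming the graph is given as a dict mapping every vertex to its list of successors:

```python
from collections import deque

def kahn_topological_sort(adj):
    """adj: dict vertex -> list of successors; must be a DAG."""
    in_degree = {v: 0 for v in adj}
    for u in adj:
        for v in adj[u]:
            in_degree[v] += 1                       # computing in-degrees: O(V + E)
    q = deque(v for v in adj if in_degree[v] == 0)  # all sources
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:                            # "delete" u: drop its out-edges
            in_degree[v] -= 1
            if in_degree[v] == 0:
                q.append(v)
    if len(order) != len(adj):
        raise ValueError("graph has a cycle")       # leftover vertices -> not a DAG
    return order
```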

9
Q

What is a DFS-based approach of Topological Sort?
What is the running time?

A
  1. Call DFS(G) to compute finishing times f[v] for all v ∈ V
  2. As each vertex finishes, insert it onto the front of a linked list (which ends up holding the topological order)
  3. Return the linked list of vertices

Time = O(V+E)
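A minimal sketch of the DFS-based version (same dict-of-lists assumption; a deque stands in for the linked list so the front insertion is O(1)):

```python
from collections import deque

def dfs_topological_sort(adj):
    """adj: dict vertex -> list of successors; assumed to be a DAG."""
    order = deque()              # plays the role of the linked list
    visited = set()

    def dfs(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs(v)
        order.appendleft(u)      # as u finishes, put it at the front

    for u in adj:
        if u not in visited:
            dfs(u)
    return list(order)
```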

10
Q

What edges are found in DFS of a connected undirected graph?

A

Tree edges and Back edges
- No forward or cross edges

11
Q

What is the definition of a Strongly Connected Component?
What is a Component graph (what type of graph is it always)?

A

A strongly connected component (SCC) of G is a maximal set of vertices C ⊆ V such that for all u, v ∈ C, both a path u → v and a path v → u exist.

*In directed graph

A component graph is a graph where each vertex represents a strongly connected component → Always a DAG (otherwise the 2 vertices are in the same SCC)

12
Q

What are common features of G and its transpose? How is the transpose of G made?

What is the running time of creating G^T from G when using an adjacency lists vs adjacency matrix?

A

G^T has all the same vertices, but the edges are reversed
- G and G^T have the same SCCs

Can create G^T in O(V+E) using adjacency lists

*An adjacency matrix would be O(V²), since you need to scan all V x V slots; not efficient for sparse graphs

13
Q

What algorithm allows to determine SCCs?

A

Kosaraju’s Algorithm:
1. Call DFS(G) to compute finishing times f[u] for all u
2. Compute G^T
3. Call DFS(G^T), but in the main loop consider vertices in decreasing order of finishing time (f[u] as computed in the first DFS)
4. Each DFS tree in step 3 is an SCC

Running time = O(V+E)
*2 DFS, 1 Transpose

Idea:
- By considering vertices in second DFS in decreasing order of finishing times from first DFS, we are visiting vertices of the component graph in topologically sorted order.
- Because we are running DFS on G^T, we will not be visiting any v from a u, where v and u are in different components.
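A minimal sketch of Kosaraju's algorithm in Python (recursive DFS and dict-of-lists input are assumptions; fine for small graphs, deep recursion would need an iterative rewrite):

```python
def kosaraju_scc(adj):
    """adj: dict vertex -> list of successors. Returns a list of SCCs."""
    # First DFS on G: record vertices by increasing finishing time.
    visited, finish_order = set(), []

    def dfs1(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs1(v)
        finish_order.append(u)

    for u in adj:
        if u not in visited:
            dfs1(u)

    # Build G^T (same vertices, every edge reversed).
    radj = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            radj[v].append(u)

    # Second DFS on G^T, taking vertices in decreasing order of finishing time.
    visited.clear()
    sccs = []

    def dfs2(u, comp):
        visited.add(u)
        comp.append(u)
        for v in radj[u]:
            if v not in visited:
                dfs2(v, comp)

    for u in reversed(finish_order):
        if u not in visited:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)    # each DFS tree on G^T is one SCC
    return sccs
```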

14
Q

How does SCCs relate to DFS finishing times?

A

Let C and C′ be distinct SCCs in G = (V, E).
Suppose there is an edge (u → v) ∈ E such that u ∈ C and v ∈ C′. Then f(C) > f(C′).
i.e. the latest finishing time among the nodes of C is greater than the latest finishing time among the nodes of C′.
*A node is closed only after all of its descendants are closed

Consider C and C′ as single nodes of the component graph (a DAG)

15
Q

What is the Ford-Fulkerson Algorithm?
What is the running time?

A

Ford-Fulkerson algorithm → method for finding the maximum flow in a flow network. Works by iteratively finding paths (augmenting paths) in the network with available capacity and pushing flow along these paths. The algorithm continues until no more augmenting paths can be found, at which point the maximum flow is reached.

*Augmenting path is a path from s → t in the residual graph
- Flow added is the smallest capacity in the residual graph along that path

Running time = O(C·|E|), where C is the sum of capacities of all outgoing edges from the source (an upper bound on the max flow), since with integer capacities the flow increases by at least 1 at every iteration
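A minimal Ford-Fulkerson sketch in Python using a DFS to find augmenting paths (the capacity dict keyed by (u, v) pairs and integer capacities are assumptions of this sketch):

```python
def ford_fulkerson(capacity, s, t):
    """capacity: dict (u, v) -> integer capacity. Returns the max-flow value."""
    vertices = {u for u, _ in capacity} | {v for _, v in capacity}
    residual = dict(capacity)                # residual capacities start at the originals
    for u in vertices:
        for v in vertices:
            residual.setdefault((u, v), 0)

    def augment(u, visited, bottleneck):
        """DFS for an s->t path in the residual graph; pushes flow on the way back."""
        if u == t:
            return bottleneck
        visited.add(u)
        for v in vertices:
            if v not in visited and residual[(u, v)] > 0:
                pushed = augment(v, visited, min(bottleneck, residual[(u, v)]))
                if pushed > 0:
                    residual[(u, v)] -= pushed   # use forward residual capacity
                    residual[(v, u)] += pushed   # add backward (undo) capacity
                    return pushed
        return 0

    max_flow = 0
    while True:
        pushed = augment(s, set(), float("inf"))
        if pushed == 0:                      # no augmenting path left: flow is maximum
            break
        max_flow += pushed
    return max_flow
```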

16
Q

What are some variants of the Shortest path problem?

A

**Based on BFS

Single source (SSSP) → Find shortest paths from a given source vertex

Single-destination → Find shortest paths to a given destination vertex
- SSSP, but reversing the direction of each edge

Single-pair → Find shortest path from u to v
- SSSP solves this
- MTL → Cancun shortest path

All-pairs → find the shortest path from u to v for all pairs u, v ∈ V
- Can be solved by running an SSSP algorithm once from each vertex, but faster algorithms exist
- Get to Cancun by shortest path, starting from anywhere

17
Q

What are conditions to the graph on which we perform the shortest path algorithm?

A
  • No negative-weight cycles → otherwise a path could go around one forever, toward -inf.

*A shortest path contains NO cycles
- Negative-weight cycles → ruled out above
- Positive-weight cycles → a shorter path exists by omitting the cycle
- Zero-weight cycles → no reason to use them, assume we don’t
A shortest path therefore has at most |V| - 1 edges

18
Q

What is the SSSP algorithm?
Is it based on BFS or DFS?

A

Single-source shortest path (SSSP) finds all shortest paths from a given source node to all other nodes

*Based on BFS
*Relax edges

19
Q

What are properties of SSSP for DAGs?
What is the trick to solve this easily?

A

A DAG implies no negative-weight cycles
- A DAG has at least 1 source and 1 sink, because otherwise it would have a cycle (it can have more of each)
- Negative edge weights are allowed, since there are no cycles
- Topologically sort the vertices, then relax each vertex’s outgoing edges in that order → finds SSSP easily (see the sketch below)

Run time = O(V + E)
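A minimal sketch of SSSP on a DAG in Python, assuming a weighted adjacency dict and a precomputed topological order (e.g. from one of the topological-sort sketches above):

```python
def dag_shortest_paths(adj, order, s):
    """adj: dict u -> list of (v, weight); order: topological order of all vertices."""
    INF = float("inf")
    d = {v: INF for v in adj}
    d[s] = 0
    for u in order:                  # process vertices in topological order
        if d[u] == INF:
            continue                 # u is not reachable from s
        for v, w in adj[u]:          # relax each outgoing edge exactly once: O(V + E)
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    return d
```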

20
Q

What is Dijkstra’s algorithm?

A

It is an algorithm to find the shortest paths from a single source to all vertices (SSSP)
*Requires non-negative edge weights

  1. Add all vertices to the PQ (keys set to inf., except the source set to 0)
  2. While the PQ is not empty, extract the min u → add it to S
    *Greedy choice: at each step, choose the vertex with the smallest distance estimate
  3. For each vertex v adjacent to u, relax the edge if it is tense

Outputs d[v] → an array of shortest-path distances from the source to every vertex

*weighted version of BFS
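A minimal Dijkstra sketch in Python with a binary heap (heapq); skipping stale heap entries instead of using Decrease-Key is an implementation choice of this sketch, not part of the card:

```python
import heapq

def dijkstra(adj, s):
    """adj: dict u -> list of (v, weight), all weights non-negative."""
    INF = float("inf")
    d = {v: INF for v in adj}
    d[s] = 0
    pq = [(0, s)]                    # (distance estimate, vertex)
    done = set()                     # vertices with final distances (the set S)
    while pq:
        du, u = heapq.heappop(pq)    # greedy: closest vertex not yet finalized
        if u in done:
            continue                 # stale entry, skip
        done.add(u)
        for v, w in adj[u]:
            if du + w < d[v]:        # relax the edge if it is tense
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))
    return d
```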

21
Q

What is the run time of Dijkstra’s algorithm?

A

Depends on implementation of the priority queue

With a binary heap, each operation takes O(log V) time → total O((V + E) log V) = O(E log V) for a connected graph

22
Q

What type of graph do Topological Sort, Dijkstra’s, Minimum Spanning Tree, Ford-Fulkerson, Kruskal’s and Prim’s algorithms apply to?

A

Topological Sort → DAG
Dijkstra’s → Directed or undirected, no negative-weight edges
Minimum Spanning Tree → Undirected, connected, weighted (negative edge weights allowed)
Ford-Fulkerson → Directed graph (flow network with capacities)
Kruskal’s Algorithm → Undirected weighted graph
Prim’s Algorithm → Undirected weighted graph

23
Q

What is Kruskal’s Algorithm?

A

It is an algorithm for solving MST, well suited to sparse graphs
- Start with each node as a separate disjoint set
- Repeatedly choose the lightest remaining edge
- Add it only if its endpoints are in different disjoint sets (Union-Find), then union the sets

Run time = O(E α(V)) + O(E log E) = O(E log V)
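A minimal Kruskal sketch in Python with a simple Union-Find (path compression only, no union by rank; a simplification for this sketch):

```python
def kruskal_mst(vertices, edges):
    """edges: list of (weight, u, v) tuples for an undirected weighted graph."""
    parent = {v: v for v in vertices}            # each vertex starts in its own set

    def find(x):
        while parent[x] != x:                    # path compression (halving)
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges, key=lambda e: e[0]):  # lightest edge first: O(E log E)
        ru, rv = find(u), find(v)
        if ru != rv:                             # different sets: the edge is safe
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total
```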

24
Q

What data structure and running time correspond to Prim’s vs Kruskal’s algorithm?
When do we use each of these?

A

Kruskal → Disjoint Sets
- O(E log E) ~ O(E log V)
- For sparse graphs

Prim’s → Priority Queue
- O (E log V)
- For dense graphs

25
Q

In Prim's algorithm, what are the time complexities of the following?
- Using a binary heap
- Building the initial queue
- Initialization
- V Extract-Min
- E Decrease-Key

A

- Using a binary heap → O(E log V) overall
- Building the initial queue → O(V)
- Initialization → O(V) *all keys set to inf.
- V Extract-Min → O(V log V)
- E Decrease-Key → O(E log V)
26
Q

What condition guarantees a graph is not bipartite?

A

If it contains an odd cycle, it can't be bipartite
*No edges within a set
*Can perform a 2-coloring of the graph to verify
27
Q

What algorithm underlies coloring vertices to check whether a graph is bipartite?

A

Run DFS → color nodes with alternating colors → check the edges that are not in the DFS tree and their endpoints (see the sketch below)
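A minimal 2-coloring sketch in Python (recursive DFS over an undirected adjacency dict; each unvisited vertex starts a new DFS, so disconnected graphs are handled too):

```python
def is_bipartite(adj):
    """adj: dict vertex -> list of neighbours (undirected graph)."""
    color = {}                            # vertex -> 0 or 1

    def dfs(u, c):
        color[u] = c
        for v in adj[u]:
            if v not in color:
                if not dfs(v, 1 - c):     # alternate colors along tree edges
                    return False
            elif color[v] == color[u]:    # an edge inside one color class -> odd cycle
                return False
        return True

    return all(dfs(u, 0) for u in adj if u not in color)
```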
28
Q

What is the 1st thing to check when looking at correctness of an algorithm?

A

Check that the loop terminates!!!
*Use contradiction in most cases for maintenance
29
Q

Which is asymptotically bigger between sqrt(n) and log n?

A

sqrt(n) > log n
30
Q

Which of the following statements about Dijkstra’s Algorithm is false?
A) Dijkstra's algorithm starts by setting all vertices to a weight of infinity, except for the source node with a weight of 0
B) The algorithm updates the distances of vertices to their neighbors based on the smallest known distance
C) Once a vertex has been marked as fully visited, it is guaranteed that no shorter path to that vertex can be found
D) The algorithm is guaranteed to find the correct shortest path in graphs with negative edge weights

A

D) is false → Dijkstra's algorithm is NOT guaranteed to find the correct shortest path in graphs with negative edge weights
31
Q

Which of the following statements about amortized analysis is true?
A) Amortized analysis calculates the time complexity of a single operation by averaging over a sequence of operations
B) In amortized analysis, the total cost of a sequence of operations is estimated by the sum of the individual worst-case costs of each operation
C) Amortized analysis is used to determine the exact time complexity of a single operation in a data structure
D) The amortized cost of an operation is always equal to its worst-case cost

A

A) Amortized analysis calculates the time complexity of a single operation by averaging over a sequence of operations

B is false because amortized analysis is not per-operation worst-case analysis:
- Amortized analysis spreads the cost of expensive operations over the many cheap ones, rather than summing all worst-case costs
Example:
- Most insertions into a dynamic array are O(1), but occasionally resizing causes a costly O(n) operation → on average, the amortized cost is still O(1)/operation
32
Q

True or False? When running BFS, a node has color gray while that node is waiting on the Queue.

A

True
33
Q

True or False? DFS finds a back edge in a directed graph if and only if an outgoing edge followed during the traversal leads to a black node.

A

False (it leads to a gray node)
34
Q

True or False? In a flow network, the value of the maximum flow from the source to the sink can never exceed the sum of the capacities of all edges leaving the sink. [Consider the Max-Flow Min-Cut theorem and flow conservation]

A

True
35
Q

True or False? The resulting graph from connecting two nodes, one from each of the two bipartite components, is also bipartite.

A

True
36
Q

True or False? When creating an MST, it is possible to add an edge which connects two vertices inside a connected component.

A

False (it would create a cycle)
37
Q

True or False? It is possible to run a single source shortest path algorithm on a graph which contains negative edge cycles.

A

True, but not Dijkstra → use Bellman-Ford instead, which detects negative-weight cycles
*Performs |V| - 1 passes over all edges (the longest possible path without cycles). After those passes, it performs 1 more pass over all edges; if any edge can still be relaxed, there is a negative-weight cycle (see the sketch below)
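A minimal Bellman-Ford sketch in Python matching the description above (the edge-list input format is an assumption of this sketch):

```python
def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, w). Returns (distances, has_negative_cycle)."""
    INF = float("inf")
    d = {v: INF for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):     # |V| - 1 passes over all edges
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    # One extra pass: if any edge still relaxes, a negative-weight cycle is reachable.
    has_negative_cycle = any(d[u] + w < d[v] for u, v, w in edges)
    return d, has_negative_cycle
```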
38
Q

True or False? Let G = (V, E, w) and G' = (V, E, w') be positively weighted undirected graphs with the same vertices V and edges E. If for every edge e ∈ E we have w'(e) = w(e)², then, for any two vertices u, v ∈ V, any shortest path between u and v in G' is also a shortest path in G.

A

False
39
Q

What is the big-O asymptotic running time of the following cases?
1) Creating a heap of n elements.
2) Pre-order traversal of an AVL tree with n^2 elements.
3) Deleting the minimum element in a min-heap with n elements.
4) Inserting n elements into an originally empty red-black tree.
5) Worst-case time for inserting a value in a hash table that handles collisions by chaining, if no prior collisions have occurred before the insertion.

A

1) Creating a heap of n elements → O(n)
2) Pre-order traversal of an AVL tree with n^2 elements → O(n^2)
3) Deleting the minimum element in a min-heap with n elements → O(log n)
- Swap the root with the last element and heapify
4) Inserting n elements into an originally empty red-black tree → O(n log n)
- Each of the n insertions takes at most O(log n)
5) Inserting a value in a hash table with chaining, no prior collisions → O(1)
40
Q

What is pre-order vs in-order vs post-order traversal?

A

Pre-order: root-left-right
In-order: left-root-right
Post-order: left-right-root
41
Q

Which of the following statements about dynamic programming is FALSE?
A) Memoization involves solving subproblems on demand, while tabulation solves all subproblems in a bottom-up manner
B) Memoization uses a recursive approach, while tabulation uses an iterative approach to solve the problem
C) Tabulation may lead to stack overflow if the problem has a deep recursion tree, while memoization avoids this by solving problems top-down
D) Memoization and tabulation can optimize problem solving by avoiding redundant calculations

A

The FALSE statement is:
C) Tabulation may lead to stack overflow if the problem has a deep recursion tree, while memoization avoids this by solving problems top-down
- Tabulation is iterative, not recursive (it is memoization's top-down recursion that risks stack overflow)
42
Q

In DFS, when following the edge u → v, what type of edge is it if v is WHITE? GRAY? BLACK?

A

For edge u → v:
- v is white: tree edge
- v is gray: back edge (going back to an ancestor)
- v is black: forward or cross edge
43
Q

What situation/algorithm do these problems relate to?
- Order in which a goalie puts on their equipment
- How many soccer balls can you send to Calgary from Mtl?
- Find groups/social networks where everyone can reach each other
- Fastest/cheapest road to go from Mtl → Boston
- Minimum roads/cost to repair so everyone stays connected
- Assign an intern to a company

A

- Order in which a goalie puts on their equipment → Topological Sort (DAG)
- How many soccer balls can you send to Calgary from Mtl? → Network Flow
- Find groups/social networks where everyone can reach each other → Strongly Connected Components
- Fastest/cheapest road to go from Mtl to Boston → SSSP
- Minimum roads/cost to repair so everyone stays connected → MST
- Assign an intern to a company → Bipartite graph
44
Q

What is the white path theorem?

A

v is a descendant of u if and only if at time d[u] there is a path u → v consisting of only white vertices (except for u, which is being colored gray)
45
Q

What is the definition of a simple path?

A

A path such that all vertices are distinct
46
Q

What is the amortized running time of Union-Find with path compression?

A

O(α(n)) per operation
- A sequence of m operations takes O(m·α(n))
47
Q

What is the definition of amortized analysis?

A

We talk about the average performance of each operation in the worst case (i.e. not averaging over a distribution of inputs; no probability!)
- Not like hashing (which uses probability)
- Identify the total work of the algorithm and divide by the number of operations
48
Q

What are the 3 methods of amortized analysis?

A

1. Aggregate analysis → show that although some individual operations may be expensive, on average the cost per operation is small. Ex: Multipop
2. Accounting method
3. Potential method
49
Q

What is the aggregate analysis method of amortized analysis?

A

A sequence of n operations takes worst-case time T(n) in total. In the worst case, the average cost, or amortized cost, per operation is therefore T(n)/n.
- Note that this amortized cost applies to each operation, even when there are several types of operations in the sequence.
- Although some operations might be expensive, many others might be cheap.
Ex: 26 hours of assignments during the term (13 weeks) → you spend an average of 2 hours per week on assignments. T(n) = 26, T(n)/n = 2.
50
Q

How is the k-bit binary counter an example of amortized analysis?

A

To increment the counter, traverse the bits, flipping 1s to 0s until you find the first 0, and flip that bit to 1
- The least significant bit flips every time (n times)
- The 2nd least significant bit flips half the time (n/2 times)
- The 3rd least significant bit flips 1/4 of the time (n/4 times)
Cost of an increment = O(# of bits flipped)
- Worst case, an increment flips k bits, so n increments take O(nk) time → correct but not tight enough
- But the average cost per operation is O(1) → n increments cost O(n) (see the sketch below)
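A minimal sketch of the k-bit counter increment (the list representation, least-significant bit first, is just an illustration choice):

```python
def increment(bits):
    """bits: list of 0/1 values, least-significant bit first (a k-bit counter)."""
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0              # flip trailing 1s back to 0
        i += 1
    if i < len(bits):
        bits[i] = 1              # flip the first 0 to 1
    return bits

# n increments flip O(n) bits in total, so the amortized cost is O(1) per increment.
counter = [0] * 4
for _ in range(6):
    increment(counter)
print(counter)                   # [0, 1, 1, 0] -> 6 in binary, LSB first
```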
51
Q

What is the accounting method of amortized analysis? Explain the Multipop example.

A

Differs from aggregate analysis:
- In aggregate analysis, all operations get the same amortized cost.
- In the accounting method, different operations can have different amortized costs.

Push → actual cost = 1 / amortized cost = 2
*Put $1 in the bank to pay for the eventual pop of that element, because every pushed element can be popped at most once
Pop → actual cost = 1 / amortized cost = 0
Multipop → actual cost = min(k, s) / amortized cost = 0
Total amortized cost = O(n)
52
Q

What is the amortized cost of a k-bit-counter increment using the accounting method?

A

- The cost of resetting bits to 0 is paid by stored credit
- Each increment sets at most 1 bit to 1 → amortized cost ≤ $2
For n operations, the total amortized cost is O(n)
53
Q

What is the amortized cost of dynamic tables? What is the cost of filling a dynamic table?

A

Insert until the table is full; when full → create a new table of size 2·m → copy all elements into the new table, then insert the new element (O(n))
Naïve analysis gives a run time of O(n²)

Aggregate/accounting view: amortized cost per operation → $1 for its own insertion + $1 to move itself at the next expansion + $1 to help move an element already in the table (a "friend") = 3 → O(1) amortized per operation, O(n) total (see the sketch below)
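A minimal doubling-table sketch in Python (a hypothetical DynamicTable class, not a specific library structure), showing where the cheap O(1) inserts and the occasional O(n) copy happen:

```python
class DynamicTable:
    """Doubling table: n appends cost O(n) in total, so O(1) amortized each."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None]

    def append(self, x):
        if self.size == self.capacity:            # table full: double its size
            self.capacity *= 2
            new_slots = [None] * self.capacity
            new_slots[:self.size] = self.slots    # copy every old element: O(n)
            self.slots = new_slots
        self.slots[self.size] = x                 # the usual cheap O(1) insertion
        self.size += 1
```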
54
Q

Suppose we perform a sequence of stack operations on a stack whose size never exceeds k. After every k operations, we make a copy of the entire stack for backup purposes. Show that the cost of n stack operations, including copying the stack, is O(n) by assigning suitable amortized costs to the various stack operations.

A

Charge $2 for each PUSH and POP operation and $0 for each COPY. When we call PUSH, we use $1 to pay for the operation, and we store the other $1 on the item pushed. When we call POP, we again use $1 to pay for the operation, and we store the other $1 in the stack itself. Because the stack size never exceeds k, the actual cost of a COPY operation is at most $k, which is paid by the $k found in the items in the stack and the stack itself. Since there are k PUSH and POP operations between two consecutive COPY operations, there are $k of credit stored, either on individual items (from PUSH operations) or in the stack itself (from POP operations) by the time a COPY occurs. Since the amortized cost of each operation is O(1) and the amount of credit never goes negative, the total cost of n operations is O(n).