Graph Agos Flashcards

1
Q

How to get connected components in undirected graph G

A

Run DFS and keep track of component #

2
Q

DFS inputs and outputs

A

input: G=(V,E) in adjacency list representation

Outputs:
prev[z]: The parent index of vertex z in the DFS visitation
pre[z]:The pre number of the vertex z in the DFS visitation
post[z]: The post number of vertex z in the DFS visitation
ccnum[z]:The connected components number IF vertices are passed in as highest post number

lowest number after running DFS on a reversed directed graph that is processed this way, the ccnum also be the topological sort order in reverse

3
Q

DFS algorithm

A
```DFS(G):
cc = 0
for all v in V, visited(v) = FALSE
for all v in V
if not visited(v) then
c++
explore(v)```
4
Q

DFS Explore

A
```Explore(z)
ccnum(z) = cc
visited(z) = TRUE
for all (z,w) in E:
if not visited(w)
Explore(w)```
5
Q

Running time for DFS

A

O(n+m)

6
Q

How to find a path between connected vertices?

A

We add an initialization of prev(v) = NULL to the base case for DFS and in Explore we set prev(w)=z after running Explore(w)

We use the prev array to backtrack. For a pair of certices in the same connected component, we can use the prev array to find a path between this pair of connected vertices.

7
Q

How can we get connectivity info for directed G?

A

use dfs: add pre/postorder numbers for the tree or forest of explored edges

8
Q

DFS on directed graphs

A
```DFS(G):
clock =1
for all v in V, visited(v) = FALSE
for all v in V
if not visited(v) then
explore(v)```
9
Q

Explore(z) for directed graphs

A
```Explore(z)
pre(x) = clock; clock++
visited(z) = TRUE
for all (z,w) in E:
if not visited(w)
Explore(w)
post(z) = clock, clock++```
10
Q

Do we use postorder numbers or pre order numbers for directed graphs?

A

postorder

11
Q

How is a graph represented?

A

We can represent a graph by an adjacency matrix; if there are n = |V | vertices v1, . . . , vn, this is an n × n array whose (i, j)th entry is
aij =

1 if there is an edge from vi to vj

0 otherwise.

For undirected graphs, the matrix is symmetric since an edge (u,v) can be taken in any direction.

12
Q

Tree edges

A

Part of the forest. post(z) > post(w)

13
Q

Forward Edges

A

lead from a node to a nonchild descendant in the DFS tree. Goes down multiple steps. post(z) > post(w)

14
Q

Back Edges

A

lead to an ancestor in the DFS tree. The post order number of post(z) < post(w)

15
Q

Cross Edges

A

lead to neither descendant not ancestor; they therefore lead to a node that has already been completely explored. post(z) > post(w)

16
Q

Cycle

A

A circular path v0->v1->v2….->vk->v0.

G has a cycle iff its DFS tree has a back eddge

17
Q

How to know if a directed hraph has a cycle?

A

if its depth first search reveals a back edge

18
Q

How is a dag linearized?

A

Decreasing post numbers

19
Q

DAG

A

Directed Acylclic graph

No cycles = No back edges

20
Q

Topologically sort inputs and outputs

A

Input: Graoh G = (V,E), directed acycic graoh(DAG)

Outut:topo[i]: the vertex number of the ith vetex in topological order from left to right

21
Q

Topologically sorting runtime

A

O(n+m)

22
Q

DAG Structure -

A

Source Vertex = no incoming edges
Sink Vertex = no outgoing edges

There is always at least one source and one sink

23
Q

Source Vertex

A

ight have some outgoing edges, but No incoming edges = highest post #

24
Q

Sink vertex

A

Might have some edges in, but it has no outgoing edges = lowest post #

25
Q

Alternative topological sorting algorithm

A

1) Find a sink, output it and delete it

2) Repeat 1 until the graph is empty

26
Q

Connectivity in directed graphs. How do we know v and w are strongly connected?

A

Vertices v and w are strongly connected if:

there is a path v->w and w->v

27
Q

SCC

A

Strongly connected component

Maximal set of strongly connected vertices

28
Q

How do we know a vertex is in a sink in a dag?

A

It has the lowest post order #

29
Q

Does the vertex with a the lowest order # for a general graph exist in a sink

A

no

30
Q

what characteristic of v will cause to live in a source for a general directed graph

A

highest post number

31
Q

Find a sink in an SCC?

A

reverse the edges

For a directed G=(V,E), look at G^r = (V,E^r) = Reverse of G

Run DFS on the reversed graph and take the vertex with the highest post number beause it is guarenteed to be in sink.

source SCC in GR = sink
sink SCC in GR = source

32
Q

SCC Algorithm

A

SCC(G):

input: DIrected G = (V,E) in adjency list representation

1. Construct reverse of graph
2. Run DFS on reversed graph
3. Order vertices by descending post #
4. RUn undirected connected components on G
5.
33
Q

BFS

A

Input: Graph G = (V,E), directed or undirected;vertex s E V

Output:
dist[u]: distance from s to u if s can reach u, inf otherwise
prev[z]: The parent index of vertex z

Runtime: O(n+m)
Use this for unweighted Single Source Shortest Path(SSSP)

Dist is given as the number f edges from s to u, not the sum of weights. Don’t mess this up

34
Q

Dijkstra’s

A

Input: Graph = (V,E), directed or undirected vertex s EV; positive edge weights w

Output:
dist[u]: distance from s to u if s can reach i, inf otherwise
prev[z]: the parent index of vertex z

Runtime: O(n+m) * O(logn) = O((n+m)logn)

Use this for non-negative weighted SSSP

It is paramount to understand that if weights are involved then we can’t get to O(n+m)

35
Q

SAT Problem - Input and output

A

input: formula in CNF with n variables and m clauses

Each clause has at most two literals

Any clause with one literal can be satisfied and removed leaving us with a CNF f’ of only clauses with exactly two literals

Output: An assignment of T or F for each variable in f if it can be satisifed. No if it cannot be satisifed

36
Q

SAT Run time

A

O(n+m)

37
Q

SAT Properties

A

Each variable has two literals x and !x

n variables have 2n literals [x1, !x1, x2, !x2, …xn, !xn]

Note that the variables and literals are not provided separately. They are part of the formula. Finding all variables is O(m) as we must look at al clauses and list them all

We call SAT problems kSAT where k is replaces by the maximum number of literals in each clause. If there is no limit then it is simply called a SAT problem

k>= 3 are NP complete problems. 2SAT is P

The 2SAT algorithm transforms a 2SAT CNF into a graph and solves it

38
Q

How does the 2SAT algorithm transfor a 2SAT CNF into a graph and solve it?

A

Create a graph G fro f by mapping the CNF to 2n vertices(one for each literal) and 2m edges(one for each implication)

2) RUn the SCC algorithm on G
3. If Ax E X, x and !x are in different SCCs then the CNF is satisifiable else return NO

1. Set all source literals to False and remove the source SCC
2. Set all sink literals to True and remove the sink SCC
3. Repeat steps 4 and 5until all literals are set
4. Return the variable assignents
39
Q

Kruskals MST Algorithm

A

Input: Graph G = (VE), connected, undirected; edge weights w

Output: A minimum spanning tree defined by the edges E mst

Runtime: O(m log n)

Sort edges by weight. Grab the lightest available edge that will not create a cycle when added to the MST. Another way to look at this is to never add edgesto vertices in the same component in the MST. Keep doing this until all adges that will not create a cycle are added. This happens at exactly n-1 edges.

The more common of the ST algorithms. Gives a set of edges which is useful to construct the next input for a black box or to compare to the original G

40
Q

Edge Weights(nxn matrix)

A

This is not an algorithm. It is how we can interpret edge weights w as O(1), O(1) write, and O(m) iteration. We take the adjacency list of a graoh and

1) Initialize an nxn atrix - O(1): Note that initialiaing a matrix with no values is O(1)

2) Set each edge weight in the table: O(1) * O(m) = O(m)
Note that setting a value is when there is a cost.

By setting values where there is an edhe, we avoid the typical n^2 cost of working with an nxn table

(u,v) access is O(1)
w(u,v) write is O(1)
When we want to do an iteratyion, we use the provided adjacency list and do an O(1) access for each edge - O(n+m)

We can assume O(m) for iteration

41
Q

Union by Rank(DIsjoint Sets)

A

This is not an algoirhtm. It is a special data structure that akes the runtime of Kruskal’s MST algorithm possible

Functions
Makeset(x):Create singleton set containing just x
find(x):To which set does x belong
union(x,y): merge the sets containing x and y

Property 1: For any x, rank(x) < rank(pie(x)): A root node with rank k is created by merger of two trees with roots rank k-1

Property 2: Any root node of rank k has at least 2^k nodes in its tree

Property 3: If there are no elements overall there can be at mode n/2^k nodes of rank k. This last property implies that max rank is O(logn) Therfore, all trees have height <= logn and an upper bound on the running time of find and union.

Path compression: Extra maintenance to imporace speed.

During each find with a series of parent pointers is followed up to the root of a tree. We will change all these pointers so they point directly to the root.

The key point here is that once a node ceases to be a root, never surfaces, and its rank is forever fixed. Therefore all ranks of all notes are unchanged by path compression even though these numbers can no longer be intepreted as tree heights. In particular, properties 1-3 hold.

42
Q

Example of how to establish non graoh problem into grap

A

We set bus stops as vertices
We set bus routes as undirected edges

We run DFS on G(V,E)

We do not need runtime for this. It will always be O(n+m)

43
Q

WHich algorithms work well for MST?

A

Kruskal and Prims

44
Q

MST Problem

A

Given: Undirected graph(V,E) with weights w(e) for e E E

Gaol: Find minimal size, connected graph of min weight

FInd min weight spanning tree of G

for T C E, w(T) sum of weights of edges in the tree

45
Q

Tree properties

A

Connected acycic graph

1) Tree on n vertices and has n-1 edges
2) One path between every pair of vertices. If there was more than one path, it would be a cycle

3) Any connected G=(V,E) with |E|=|V|-1 is a tree

46
Q

Greedy approach for MST

A

Review lecture

47
Q

Kuskals algorithm

A

Kruskals(G):

Input: Undirected G=(V,E) with weights w(e)

1. SOrt E by increasing weight
2. Set X = 0
3) for e = (v,w) is in E ( go through in order) if XV e doesn’t have a cycle then: X = X Ve
4) Return (X)

X is an MST

48
Q

Big O for Kruskals

A

O(mlogn)

49
Q

Cuts

A

Set of edges that partition vertices into 2 sets S and the complement of S.

Edges crossing between S and S bar

50
Q

Cut Property

A

For undirected graph G = (V,E)

Take a subset of edges are apart of an MST T

Take subset S where no edge of X is in the cut(S, S bar)

Let e* be minimum weight edge in cut(S and S bar)

ANy edge minimum weight across a cut is part of some MST

51
Q

Prim’s

A

MST akin to Dijstra’s

Use cut property to prove correctness of Prim’s algorithm

52
Q

Approach to solving problems with graphs

A
1. Figure our black box
2. State mod if needed. You may use black box as as is
3. State steps. What changes do you need for the input. Repeat with more black boxes if needed
4. Prove correctness
5. Analyze runtime
53
Q

How do directed and undirected graphs behave differently in DFS

A

Pre/Pst numbers give info on how a graph could be explored. Some numbers are interchangable between two vertices, some can never be swapped.

Pre/post numbers in undirected graphs give infrmation on how a graph would be explored given a starting point. The root node that is used can determine entirely different pre/post numbers

ccnum in a directed graph only works if you always explore fro a sink vertex if all previous sink SCCs were removed.

A source vertex will be able to reach al connected vertices and the ccnum will be somewhat useless.
ccnum in an undirected graph gives how many components there are.

54
Q

Explore input and output

A

Input: Graph G = (V,E), directed or undirected; v in V

Output: visited[u] is set to true for all nodes u reachable from v

Any other data structure that DFS has if needed

55
Q

Explore runtime

A

O(m) if run as part of DFS

O(n+m) if run by itself and visited[] needs to be created

56
Q

Explore properties

A

This is a subroutine of DFS and does most of the work in DFS
It runs on all edges and vertices that are reachable from the provided v

Use this is you only care about single source but need DFS information
All of the outputs in DFS are availble in explore as well
In fact ccnum, prev, pre, pst are set in the explore subroutine
If you are allowed to modify your black box algorithm, it is likely to be in explore

57
Q

Topological sorting

A

This algorith works by running DFS on the DAG and using the post order number to sort the vertices from highest post# to lowest post#

When a DAG is ordered from source to sink then all edges go from left to right
Either run a graph as topological sort or do it as part of DFS

58
Q

Strongly Connected Algorithm Inputs and outputs

A

Input: Graph G=(V,E), directed
Output: G SCCC = (V SCCC, E SCCC) where G is a metagraph(a DAG) with each SCC in G forming a vertex in V and each edge between SCCCs in W.

ccnum[.] comes from DFS used underneath this algorithm

V SCCC will look like 1,2,3,4 which are ccnums

E SCCC will look like (1,2), (2,3), (3,4) which are edges between V SCCC

Use ccnum[u] and cccnum[v] to find which SCC u and v are in

59
Q

SCC Runtime

A

O(n+m)

60
Q

SCCC Algorithm

A

Works by running two DFSses.

Run once with pre/post order numbering on Gr which is the reverse of G

Sort V in descending post order numbers. This gives sink to source

Run again on G with V sorted(not on Gr)

Output will have ccnum representing SCC with highest = source and lowest - sink

We use the ccnum to gather up vertices that belong to each SCC

We then check all original edges and their endpoints. If the endpoints are in different SCCs and a corresponding E between those SCCs that does not already exist, we add a E to represent the edge from one SCC to another.

61
Q

How does 2SAT algorithm transform a 2SAT CNF into a graph and solve it

A
1. Create a graph G from f by mapping the CNF to 2n vertices one for each literal and 2 edges one for each implication
2. Run SCC on G
3. If x is in X and x and !x are in different SCCs then CNF is satisfiable else return NO
4. Set all source literals to False and remove the source SCC
5. Set all sink literals to True and remove the sink SCC
6. repeat steps 4 and 5 until all literals are set
7. Return the variable assignments
62
Q

Prims Input/Output and runtime

A

Input: Graph G = (V,E) connected, undirected, edge weights w

Output: A minimum spanning tree defined by array prev

Runtime: O(mlogm) simplified to O(mlogn)

63
Q

Prim’s MST Algo

A

Start with arbitrary vertex v and put into subtree S of included vertices. In each iteration, grow S by adding the lightest edge between a vertex in S and a vertex outside of S. Continue untill all vertices are in S. This happens at exactly n-1 edges

Very similar to Dijkstra’s in construction. Not as used as Kriskals but can be very useful if you need a path in the MST output. Lack of edge information makes it unlikely to be what you want

64
Q

max flow inputs and outputs

A

Inputs: G(V,E), s,t,c

Directed Graph G(V,E)
s,t in V such that s is a source and t is a sink

c is capacities such that c(u,v) > 0 for (u,v) in E
This set of inputs is also known as a flow network

Outputs: f*
f is a function or vector such that f(u,v) gives flow of u,v

It is a general term to represent flow at some point
f is constantly updated while the algorithm runs

f* is the max flow such that the soe over f(.,t) is maximized
C = capacity = size of max flow = size(f
) some of f*(., t)

65
Q

Maxflow rules

A

Cycles are ok
Anti-parallel edges are not ok
Instead add ane xtra vertex in one of the ani-parallalel edges with the sae direction and c(e) as the original edge

for all (u,v) in E:
0 <= f(u,v) <= c(u,v)
Flow of each edge is a non-negative, not exceeding capacity of each edge

for all v in V where v != s and v != t
total flow into v is equal to total flow out of v

Total flow out of s is equal to total flow into t
Consequently C is equal to both

66
Q

Max flow general steps

A

Set f(u,v) = 0 for (u,v) in E
Build residual network G(V,E)
Check for any path in G from s to t. Can use DFS or BFS
If none is found then we are done

If found, get the min capacity along path as c(path)
Augment by c(path) units along the path. For each forward edge(u,v) along the path, increase f(u,v) by c(path) units.

FOr each backwards edge(v,u) along the path, decrease f(u,v) by c(path) units

Recurse to build residual network step

67
Q

Build Residual Network inputs/outputs and runtime

A

Input: G(V,E),c,f
Outputs(G=(V,E), cf

Runtime: for (u,v) in E - O(n+m)

O(n+)
We assume G is a connected graph, m >= n-1; O(n+m) = O(m)

68
Q

Residual Network Algorithm

A
```for (u,v) in E
if f(u,v) < c(u,v):
Theseare forward edges```
```if f(u,v) > 0:
These are backward edges```
69
Q

Ford-Fulkerson

A

User General Max Flow Algorithm
Assumes all capacities are positive integers

Runtime:

1) Rounds: Flow increases by >= 1 per round, so there are C rounds where C is the size of max flow. O(C)
2) Per round:

Explore/DFS/BFS - O(n+) or O(m)
O(n-1) to augment f
O(n+m) + O(n-1) = O(n+m)
We assume G is a connected graph, m >= n-1; O(n+m) = O(m)

We assume G to be a connected not beceause the algorith requires it but because a disconnected graph isn’t well defined for a ax flow problem. Suppose s can’t reach it. What is our solution?

O(m) * O(C) = O(mC)

70
Q

Edmonds-Karp

A

Specific version of Fold-Fulkerson with the path finding well defined to use BFS. Rather than using any path, it uses shortest path

Use General max flow algorith
Capacities can be any positive number
Replace step 3 in the genera; ax flow to check for shortest path in G from s to t. Use BFS. If non is found we’re done

Runtime

Rounds
Delete and add any edge <= n/2 times - O(n)
There are m edges in a graph
O(nm)

Per round: BFS - O(n+m)
O(n-1) to augment f
O(n+m) + O(n-1) = O(n+m)
We assume G is a connected graph m > n-1; O(n+m) = O(m)

O(m) * O(nm) = O(nm^2)

71
Q

DAG properties

A

In a dag every edge leads to a vertex with a lower post nuber

Every dag has at least one source and at least one sink

A directed graph has a cycle if and only if its DFS reveals a back edge

72
Q

Are cycles ok in a flow network

A

Yes.

73
Q

For Fulkerson is psuedo polynomial. TF

A

True

74
Q

Ford FUkerson

A

Find any path and run DFS or BFS and take any path

75
Q

Ford fulkerson vs Edmonds-Karp

A

FF: Assume capacities are integers. Does not always terminate on irrational capacities. Can convert rational capacirties to integers(out of scope

FLow increases by >= 1 per round so there are C rounds where C is the size of max flow
Runs in O(mC)
Can be faster than Edmonds-Karp is C is bounded by n or m

76
Q

Ford fulkerson vs Edmonds-Karp

A

FF: Assume capacities are integers. Does not always terminate on irrational capacities. Can convert rational capacirties to integers(out of scope

FLow increases by >= 1 per round so there are C rounds where C is the size of max flow
Runs in O(mC)
Can be faster than Edmonds-Karp is C is bounded by n or m

```Edmonds carp
Uses BFS to find next path
works with irrational capacities
O(nm^2)
Generally considered faster for unbounded capacities```