
## Section Notes for Week 11: Graph Algorithms

### Graph Searching

Ok, we have these graph structures we saw last time. Wouldn't it be nice if we could actually use them for something? How about searching, for example? That would be a nice start.

So how are we going to do this? Can it be done iteratively? (Without extra data structures?)

Remember tree traversals? Recall that we could do traversals using either recursion (basically a Stack) or a Queue. Well, as it turns out, we can do the same thing for a graph. But since a graph can have cycles, we must mark each vertex as soon as we visit it so that we don't loop forever when we hit a cycle. Furthermore, a graph class can actually contain several disjoint graphs. So we must actually loop over all vertices every time.

The depth-first search works just like the familiar traversals of a tree, where we go all the way down to the leaves before returning up. However, a graph can have cycles and disconnected vertices, so we must handle those.

```
DFSWrapper()
{
    for all vertices v
        v.encountered = false;
    for all vertices v
        if v.encountered == false
            DFS(v);
}

DFS(v)
{
    v.encountered = true;
    for all neighbors w of v
        if w.encountered == false
            DFS(w);
}
```
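
Here's a minimal Python sketch of the pseudocode above, assuming the graph is stored as an adjacency-list dict (`graph[v]` is a list of v's neighbors); the names are mine, not from lecture.

```python
def dfs_all(graph):
    visited = set()
    order = []

    def dfs(v):
        visited.add(v)        # mark as soon as we visit, so cycles can't loop forever
        order.append(v)
        for w in graph[v]:
            if w not in visited:
                dfs(w)

    # loop over ALL vertices, so disjoint pieces of the graph get covered too
    for v in graph:
        if v not in visited:
            dfs(v)
    return order
```

Note the two loops from the wrapper collapse into one here, since a set starts out with everything "unencountered" for free.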

What's the running time?

So let's think about how to do this using a Queue. Recall the level order of a tree. We want to do something similar for graphs, but we must iterate over all vertices to make sure we encounter every one. So we can think of this algorithm as sort of "fanning out".

PseudoCode, if you please:

```
BFSWrapper()
{
    for all vertices v
        v.encountered = false;

    for all vertices v
        if v.encountered == false
            BFS(v);
}

BFS(v1)
{
    v1.encountered = true;
    Q.enqueue(v1);

    while (!Q.isEmpty())
    {
        v = Q.dequeue();
        for all neighbors w of v
            if (w.encountered == false)
            {
                w.encountered = true;
                Q.enqueue(w);
            }
    }
}
```
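
A quick Python sketch of the BFS above, with the same adjacency-list dict assumption as before:

```python
from collections import deque

def bfs_all(graph):
    encountered = set()
    order = []
    for s in graph:                      # wrapper loop: cover disjoint pieces
        if s in encountered:
            continue
        encountered.add(s)
        q = deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in encountered:
                    encountered.add(w)   # mark when ENQUEUED, not when dequeued
                    q.append(w)
    return order
```

Marking on enqueue (not dequeue) matters: otherwise a vertex reachable two ways gets queued twice.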

What's the running time?

### Graph Sorting

A topological sort is an ordered traversal of a DAG (a Directed Acyclic Graph) where you visit/print each vertex only after you have visited all of its parent vertices. A vertex with no parents (or with all of its parents already visited and removed from consideration) is said to have an indegree of 0.

Again, we have a choice for a "bag" object to store the vertices ready for traversal. Let's deal with the Queue this time, because it's more intuitive.

PseudoCode:

```
for each vertex v
    calculate "indegree" of v    // How?
    if it is 0, Q.enqueue(v)

while (!Q.isEmpty())
{
    v = Q.dequeue();
    reduce the indegree of all of v's neighbors by 1
    if any neighbor now has indegree 0
        Q.enqueue(neighbor);
}

if (any vertex was never dequeued)
    not a DAG.
```
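
Here's that queue-based top sort as a Python sketch (graph maps each vertex to the list of vertices its edges point at; names are illustrative):

```python
from collections import deque

def topo_sort(graph):
    # answer the "How?" above: count incoming edges by scanning all edges once
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1

    q = deque(v for v in graph if indegree[v] == 0)
    order = []
    while q:
        v = q.popleft()
        order.append(v)
        for w in graph[v]:
            indegree[w] -= 1          # "remove" v's outgoing edges
            if indegree[w] == 0:
                q.append(w)

    if len(order) != len(graph):      # some vertex never reached indegree 0
        raise ValueError("not a DAG")
    return order
```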

So the example that works best is class prereqs. Jason used it in lecture, but I suspect only a couple of you are actually going, so I'm just going to reuse it here.

Running time?

Ok, what about using recursion? Well, as it turns out, DFS can be used almost straight away to do a top sort. What Jason described was numbering all the vertices in reverse (starting with |V|) as you return from each DFS call (a PostOrder type of deal).

So this will get a NUMBERING on our vertices.. Any suggestions on how to actually PRINT them?
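
The reverse-numbering idea sketches out like this in Python (appending in postorder and reversing at the end plays the role of numbering down from |V|):

```python
def topo_sort_dfs(graph):
    visited = set()
    order = []                  # filled in postorder

    def dfs(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                dfs(w)
        order.append(v)         # "numbered" as we RETURN from the call

    for v in graph:
        if v not in visited:
            dfs(v)
    order.reverse()             # printing in reverse postorder gives a valid top sort
    return order
```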

### Least Cost Paths (Dijkstra's)

Ok, so we can sort these bastards. What if we want to find the shortest route from a particular source vertex to another?

So given the graphs we've seen so far, we actually have an algorithm that can be used already. Which? And why?

The general approach is to initialize all vertices to unprocessed with infinite distance. Then set the distance of your source to zero. From here:

```
for i = 1 to V
{
    select the unprocessed vertex v with the lowest distance
    mark v processed
    for each of v's unprocessed neighbors w
        if distance to v + weight(v,w) < current distance to w then
            distance to w = distance to v + weight(v,w)
}
```
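
Here's a heap-based Python sketch of the above, assuming `graph[v]` is a list of `(neighbor, weight)` pairs. Rather than updating entries in place, stale heap entries are lazily skipped, which is a common way to fake decrease-key:

```python
import heapq

def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    processed = set()
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)   # select unprocessed vertex with lowest distance
        if v in processed:
            continue                 # stale entry: v was already processed
        processed.add(v)
        for w, weight in graph[v]:
            if w not in processed and d + weight < dist[w]:
                dist[w] = d + weight
                heapq.heappush(heap, (dist[w], w))
    return dist
```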

So my first intuition when looking at this sketch is to say "Well, let's make select as fast as possible." Right? So what data structure would be good for this?

A heap will give us an O(lg V) select (delete_min). Also, we update the distance at most once per edge, each update taking O(lg V) (insert). So our total running time is O(V lg V + E lg V) = O((V + E) lg V).

So that was easy, right? Nice and optimized right off the bat... Or is it? What's wrong with this? Is it really (asymptotically) fast in all cases?

Also, how can we actually store the path?

Also, this gives us the distance to ALL nodes. Is there a faster version for just ONE node? Consider the work that MUST be done to find a path.

• Work example.

### Minimum Spanning Trees

Ok, so now we have paths. What's the next step in minimizing our graph?

Well, we could minimize the TOTAL cost inherent in a graph. For designing a highway system, for example.

We have two algorithms for this. Prim's and Kruskal's.

#### Kruskal's

Kruskal's works a bit more intuitively, so let's discuss it first. To build an MST, we need V-1 edges connecting all V vertices, and the total weight must be minimized. So if we minimize the weight of each edge as we select it, we should be set.

So we basically build a heap out of all edges, and add the smallest V-1 of them that don't create cycles to the tree.

How do we detect cycles? Well, a cycle is created when you add an edge between nodes that already link to each other (and thus are already connected).. So what sort of data structure can tell us this?

So when you select a smallest edge, you must first do a Find on its endpoints to see if they are already connected. If they are not, add the edge, and Union the two vertices in your disjoint sets. If they are, find another minimum edge and try again.

So, Jason's pseudocode:

```
loop V-1 times
{
    choose the minimum-weight edge out of all edges not
    yet accepted or rejected.

    if adding it to the tree would cause a cycle,
    reject it and choose the next minimum-weight edge,
    and check that one. Continue rejecting and choosing
    the next minimum-weight edge until we find an edge that
    does not cause a cycle.

    Once we find an edge that does not cause a cycle,
    accept it and add it to the minimum spanning tree.
}
```
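
A Python sketch of Kruskal's, with a bare-bones disjoint-set Find/Union for the cycle check (no path compression or union-by-rank, just the idea). `edges` is a list of `(weight, u, v)` triples; the helper names are mine:

```python
def kruskal(vertices, edges):
    parent = {v: v for v in vertices}

    def find(v):                        # walk up to the set's representative
        while parent[v] != v:
            v = parent[v]
        return v

    mst = []
    for weight, u, v in sorted(edges):  # consider edges in increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # different sets => adding this edge makes no cycle
            parent[ru] = rv             # Union the two sets
            mst.append((weight, u, v))
        if len(mst) == len(vertices) - 1:
            break                       # V-1 edges accepted: done
    return mst
```

Sorting up front stands in for the edge heap; either way you pull edges out in increasing weight order.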

#### Prim's

Prim's works just like Dijkstra's. The basic idea is to grow a tree by repeatedly adding the node connected to the tree by the minimum-weight edge. So you keep track of the weight of the cheapest edge to each node (but NOT the weight of the PATH to each node), and continuously add the node with the minimum such weight.

pseudocode:

```
for i = 1 to V
{
    select the unprocessed vertex v with the lowest weight
    mark v processed (add an edge to v from its min parent)
    for each of v's unprocessed neighbors w
        if weight(v,w) < current weight to w
            min weight to w = weight(v,w)
            min parent of w = v
}
```
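
And a Python sketch of Prim's, heap-based like the Dijkstra sketch earlier; the only real change is that the heap key is a single edge's weight, not a path distance. `graph[v]` is again a list of `(neighbor, weight)` pairs:

```python
import heapq

def prim(graph, start):
    in_tree = set()
    tree_edges = []
    heap = [(0, start, None)]                     # (edge weight, vertex, min parent)
    while heap:
        weight, v, parent = heapq.heappop(heap)   # min-weight edge into the tree
        if v in in_tree:
            continue                              # stale entry
        in_tree.add(v)
        if parent is not None:
            tree_edges.append((parent, v, weight))
        for w, wt in graph[v]:
            if w not in in_tree:
                heapq.heappush(heap, (wt, w, v))  # key is the EDGE weight, not the path
    return tree_edges
```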

Work out examples.

Question: is a path through a minimum spanning tree always a shortest path?