Section notes for Week 11: Graph Algorithms

Section Notes for Week 11: Graph Algorithms

Story thus far

  1. Conversion of Java into C++
  2. C++ I/O
  3. C++ Preprocessor
  4. Pointers, local variables, and the stack
  5. new and delete
  6. Code Review
  7. More Example Code
  8. Reference Variables and Method invocation
  9. Const keyword
  10. Constructors, copy constructors, and destructors
  11. Operator Overloading and the string class
  12. Big 3
  13. Templates and the array class
  14. Discussion of the List ADT
  15. A look at an array list implementation
  16. A look at the Linked List implementation
  17. Brief Discussion of Doubly Linked Lists
  18. Iterators
  19. Function Objects
  20. Generic Programming
  21. Trees
  22. Binary Search Trees
  23. Tree Traversal
  24. Trees with Back Pointers
  25. Exam Review
  26. Motivation
  27. Cases
  28. Rotations
  29. The Big Picture
  30. Iterators in AVL FIXME: Broken

Today:

  1. Graph Searching
  2. Graph Sorting
  3. Least cost paths
  4. Minimum Spanning Trees

Graph Searching

Ok, we have these graph structures we saw last time. Wouldn't it be nice if we can actually use them for something? How about searching for example.. That would be a nice start..

So how are we going to do this? Can it be done iteratively? (Without extra data structures?)


Remember tree traversals? Recall that we could do traversals using either recursion (basically a Stack) or a Queue. Well, as it turns out, we can do the same thing for a graph, and since a graph has cycles, we must mark a vertex as soon as we visit it so that we don't break when we encounter cycles. Furthermore, a graph class can actually have several disjoint graphs in it. So we must actually loop over all vertices every time.

The depth-first search will work just like the familar traversals of a tree, where we go all the way down to the leaves before returning up. However, there are cycles, and disconnected nodes, so we must handle that.


 
   DFSWrapper()
   {
      for all vertices v
         v.encountered = false;
      for all vertices v
         if v.encountered == false
            DFS(v);
   }


   DFS(v) 
   {
      v.encountered = true;
      for all neighbors w of v
         if w.enountered == false
            DFS(w);
   }

What's the running time?

So let's think about how to do this using a Queue. Recall the level order of a tree. We want to do something similar for graphs, but we must iterate over all vertices to make sure we encouter them. So we can think of this algorithm as sort of "Fanning out"

PseudoCode, if you please:


BFS:


   BFSWrapper()
   {
      for all vertices v
         v.encountered = false;
  
      for all vertices v
         if v.encountered == false
            BFS(v);
   }

   BFS(v1)
   {
      v1.encountered = true;
      Q.enqueue(v1);

      while (!Q.IsEmpty())
      {
         v = Q.Dequeue();
         for all neighbors w of v    
            if (w.encountered == false)  
            {
               w.encountered = true;
               Q.Enqueue(w);
            }
      }
   }



What's the running time?

Question: Is the ordering of the vertices in either of these algorithms well-defined?

Graph Sorting

A topological sort is an ordered search of a DAG (a Directed Acyclic Graph) where you visit/print each vertex only when you have been to all of it's parent vertices. A vertex with no parents (or with it's parents already visited) is said to have an indegree of 0.

Again, we have a choice for a "bag" object to store the vertices ready for traversal. Let's deal with the Queue this time, because it's more intuitive.

PseudoCode:



   for each vertex v
      Calculate "indegree" of v // How?
	  if it is 0, Q.enqueue(v)

   
   while(!Q.empty())
   {
      v = Q.dequeue();
      reduce the indegree of all of v's neighbors by 1.
      If any neighbor has indegree 0
          Q.enqueue(neighbor);
   }
   
   if(Graph not empty)
      Not a DAG.

So the example that works best is class prereq's. Jason used it in lecture, but I suspect only a couple of you are actually going, so I'm just going to reuse it here.

Running time?

Ok, what about using recursion? Well, as it turns out, DFS can be used almost straight away to do a top sort. What Jason described was that you number all the vertices in reverse (starting with |V|) as you return from the DFS call (a PostOrder type of deal).

So this will get a NUMBERING on our vertices.. Any sugestions on how to actually PRINT them?

Question: Is this ordering truly well defined?

Least Cost Paths (Dijkstra's)

Ok, so we can sort these bastards. What if we want to find the shortest route from a particular source to another one?

So given the graphs we've seen so far, we actually have an algorithm that can be used already. Which? And why?

However, what if we want to represent distances on the graph, for example miles between cities for planning connecting flights? Well, then we need a new algorithm. Dijkstra's Algorithm! (That's pronounced Dike-stra).

The general approach is to initialize all veritices to unprocessed and infinite distance. Then, set the distance of your source to zero. From here:


for i = 1 to V
{
    select unprocessed vertex v with lowest distance to v 
    mark v processed 
    for each of v's unprocessed neighbors w, 
        if distance to v + weight(v,w) < current distance to w then
	       distance to w = distance to v + weight(v,w)
}

So my first intuition when looking at this sketch is to say "Well, lets make select as fast as possible." Right? So what data structure would be good for this?

A heap will give us a O(lg V) select (delete_min). Also, we update the distance at most once per edge, taking O(lg V) (insert). So our total running time is O(V lg V + E lg V) = O( (V + E) lg V).

So that was easy, right? Nice and optimized right off the bat... Or is it? What's wrong with this? Is it really (asymptotically) fast in all cases?

Also, how can we actually store the path?

Also this gives us the distance to ALL nodes. Is there a faster version for just A node? Consider the work that MUST be done to find a path.

  • Work example.

    Minimum Spanning Trees

    Ok, so now we have paths. What's the next step in minimizing our graph?

    Well, we could minimize the TOTAL cost inherent in a graph. For designing a highway system, for example.

    We have two algorithms for this. Prim's and Kruskal's.

    Kruskal's

    Kruskal's works a bit more intuitively, so let's discuss it first. So to build a MST, need V-1 edges to connect to the V, and the total weight must be minimized. So if we minimize the weight of each edge as we select it, we should be set.

    So we basically build a heap out of all edges, and add the smallest V-1 of them that don't create cycles to the tree.

    How to we detect cycles? Well, a cycle is created when you add an edge to a set of nodes that already link to eachother (and thus are all connected).. So what sort of data structure can tell us this?

    So when you select a smallest edge, you must first do a Find on it's endpoints to see if they are already connected, if they are not, add the edge, and Union the two vertices in you disjoint sets. If they are, find another minimum edge, and try again.

    So, Jason's pseudocode:

    
    loop V-1 times
    {
        choose minimum-weight edge out of all edges not
        yet accepted or rejected.
    
        if it would cause a cycle if we added it to the tree,
           then reject it and choose the next minimum-weight edge
           and check that one. Continue rejecting and choosing
           the next minimum-weight edge until we find an edge that 
    	   does not cause a cycle. 
    
        Once we find an edge that does not cause a cycle, 
        accept it and add it to the minimum spanning tree.
    }
    
    
    

    Prim's

    Prims works just like Dijkstra's. The basic idea is to add edges to a tree if they connect a node with the minimum weight. So you keep track of the weight to each node (but not the weight of the PATH to each node), and continously add the node with the minimum weight.

    pseudocode:

    
    for i = 1 to V
    {
        select unprocessed vertex v with lowest distance to v 
        mark v processed (add an edge to v from it's min parent)
        for each of v's unprocessed neighbors w, 
            if weight(v,w) < current weight to w
               min weight to w = weight(v,w)
               min parent of w is v
    }
    
    

    Work out examples.

    Question: is a path through a minimum spanning tree always a shortest path?