## 2011年5月11日 星期三

### [ Data Structures with Java ] Section 25.5 : Dijkstra's Minimum-Path Algorithm

Preface :
The shortest-path problem finds the path of shortest length connecting a starting vertex to each of the reachable vertices in the graph. The algorithm uses a breadth-first search of the vertices. Minimum path is a similar problem for weighted graphs. The problem determines a path of minimum weight from a starting vertex to each reachable vertex in the graph. As the graph in below figure illustrates, a shortest path may not be the minimum path. Assume A is the starting vertex and E is the reachable vertex. Three paths, A-B-E, A-C-E and A-D-E have path length 2, with weights 15, 17 and 13 respectively. The minimum path is A-C-D-E, with weight 11 but path length 3.

The Dijkstra algorithm solves the minimum-path problem. It uses a greedy strategy that solves the problem in stages. At each stage, we select the adjacent vertex x that defines a minimum path from the starting vertex. The term "greedy" indicates that the strategy attempts to maximum short-term gains, even through the decision might need to be reversed as we discover new vertices. In the sample graph in upper figure, we start at A and discover neighbors B, C and D. Vertices C and D defines path A-C and A-E having path weights 4 and 8, respectively. However, the path A-B has minimum weight among all of the neighbors of A. From vertex B, we discover its neighbor E using edge e(B, E) with weight 12. The total weight of path A-B-E is 15. We record our finding for vertex E that include the value 15 for the path weight and B as the parent. Latter, we discover that using C as an intermediate vertex yields a path A-C-E that connects A to E with a smaller path weight 14. Properties for E are updated to have a new path weight 14 and parent C. As new vertices are discovered, we continually make updates. In the end, we discover that the path A-C-D-E with weight 11 with weight 11 is the minimum path.

Designing the Dijkstra Algorithm :
The Dijkstra minimum-path algorithm uses the iterative strategy that we employed for the shortest-path problem, with one major difference. The search uses a minimum-priority queue rather than a queue to store the vertices. To define objects in the priority queue, we declare the static inner class MinInfo. A MinInfo object contains a vertex reference and the path weight as data members. The vertex is the ending vertex on a path from the starting vertex andpathWeight is sum of the weights for the edges of the path. The class implements the Comparable interface by defining compareTo() using the pathWeightattribute. This allows us to use the object in a priority queue.

- MinInfo class :
1. /**
2. * BD :
3. *   Priority queue data used by minimumPath() and minSpanningTree() algorithms
4. */
5. private static class MinInfo implements Comparable> {
6.     public T endV;
7.     public int pathWeight;
8.
9.     @Override
10.     public int compareTo(MinInfo o) {
11.         ...
12.     }
13. }

Each step in the algorithm removes a MinInfo object from the priority queue and identifies the vertex. Because no subsequent step could find a new path to the vertex with a smaller weight, we have the minimum-path weight and can color the vertex BLACK(visited). The next action is to look at each of the neighbors of the vertex. For each neighbor that is not BLACK, check whether adding the edge to the minimum path from the starting vertex to the current vertex will create a path from the starting vertex to the neighbor that is "better" than any which have been determined previously. The data value for the neighbor vertex stores the minimum-path weight for any of the prior paths or is infinity when the neighbor is initially discovered. If the new path is better, create a MinInfo object with the neighbor and the total path weight as arguments and insert it into the priority queue. At the same time, update the data value and parent reference for the neighbor. The algorithm terminates whenever the priority queue becomes empty or when the number of visited vertices matches the vertex size of the graph. The priority queue could be come empty when a relatively small number of vertices are reachable from the starting vertex.
Let us illustrate the algorithm by using the upper graph, which is repeated by your reference. Vertex A is the starting vertex. To simplify notation, let MI(v, w) designate a MinInfo object with v as the ending vertex for path P(A, v) and with total path weight w. The algorithm begins with vertex A :
Setup:

Assign ∞ as the data value for each vertex. For starting vertex A, assign its data value to be 0 and let A be the parent reference. Create the MinInfo object MI(A,0) that represents a path from A to itself with initial weight 0. The object is the first entry in the priority queue.

Each iterative step pops a MinInfo object from the priority queue and identifies its vertex. If the edge from the vertex to a nonvisited neighbor creates a better path, updates occur to both data field and parent reference field of the neighbor. Each column in the table lists the data value and current parent reference for each vertex after any updates occur.
Step 1:

Pop MI(A,0) from the priority queue. Color A BLACK to indicate that it is visited with the minimum path from A to A having weight 0. The vertices B, C and D are neighbors of A with data value ∞. Create MinInfo objects MI(B,3), MI(C,4) and MI(D,8) indicating that paths from A to the vertices have weight 3, 4 and 8, respectively. Use the weights to update the data fields in the three vertices and assign A as their parent references. Push the objects into the priority queue.

Step 2:

Pop the element with minimum weight. In this case, pop MI(B, 3) from the priority queue and color the vertex B BLACK. The only neighbor of B is vertex E. Using the minimum path from A to B with weight 3 and adding the weight of the edge (B, E) creates a "better" path A-B-E from A to E with weight 12. Clearly, this path is better because currently E has never been discovered and its data value is ∞. Let 12 be the new data value for E and assign B as its parent reference. Create the object MI(E,12) and push it into the priority queue.

Step 3:

Pop MI(C,4) from the priority queue and color the vertex C BLACK. The vertex has two nonvisited neighbors, D and E. The data value and parent reference for D indicates a path exist from A to D with weight 8 containing A as the parent of D. Using the minimum path from A to C with weight 4 and adding edge (C,D) with weight 2 creates a path with total weight 6, which is better then the current path to D :
New weight for path A-C-D = weight to C + weight (C,D) = 4+2=6

Update the fields for D so that the data is 6 and the parent reference is C. Push the object MI(D,6) into the priority queue. Note that the priority queue now has two MinInfo objects that reference vertex D. The one with weight 6 will come out of the priority queue first. The data field for vertex E indicates a path from A of weight 15. A second path from A to C followed by the edge (C,E) with weight 13 has total weight 17 and is not a better path.

Step 4:

Pop MI(D,6) from the priority queue and color the vertex D BLACK. The neighbor, E has a path from A with weight 15. A new path that uses the minimum path to D and the edge (D,E) with weight 5 creates a better path :
New weight for path A-C-D-E = weight to D + weight (D,E) = 6+5=11

Assign 11 as the new data value for E and let D be the parent reference. Push the object MI(E,11) into the priority queue.

Step 5-6:

Pop MI(D,8) from the priority queue. The vertex is already BLACK from the step 4, so proceed to the next step. Pop MI(E,11) from the priority queue and color it BLACK. The algorithm terminates because all five vertices have been visited.

Why It Works :
To verify the Dijkstra algorithm results in a minimum path, assume that the algorithm finds a path from starting vertex vs to the ending vertex ve, which is not optimal. That is, a second path connects the vertices with a smaller weight. Assume this second path and the Dijkstra path are the same up to vertex u and that the weight of path P(vs, ve) found by Dijkstra's algorithm is W. The better (second) path has an intermediate vertex x that is in the priority queue but is not marked. The weight to x is :
w = weight for P(vs, x) = weight of P(vs, u) + weight (u, x)

The weight w must be less than W, so the MinInfo(x,w) object will come out of the priority queue before the path found by the algorithm. Continuing in this way, all the vertices on the better path will come out of the priority queue, and the algorithm will find the optimal path, contradicting our assumption that the algorithm does not find the minimum path :

Implementing the minimumPath() Method :
The method minimumPath() implements the Dijkstra algorithm. The method has the same signature as shortestPath(). The arguments are a graph and the starting vertex. The storage collection for MinInfo objects is the minimum-priority queue minPathPQ. The design of the algorithm is also modeled on the shortest-path algorithm. The boolean variables foundMinPath indicates when the minimum path is found. The search is an iterative process that uses a minimum-priority queue to store MinInfo objects.

- Method minimumPath() :
1. public static  void minimumPath(JGraph g, T sVertex) {
2.     if(!g.containsVertex(sVertex)) throw new java.lang.IllegalArgumentException("minimumPath(): Vertices "+sVertex+" doesn't exist");
3.     HeapPQueue> minPathPQ = new HeapPQueue>(new Less());
4.     HashMap verticesColor = new HashMap();
5.     Iterator iter = g.vertexSet().iterator();
6.     while(iter.hasNext()) verticesColor.put(iter.next(), VertexColor.WHITE);
7.     g.initData();
8.     g.setData(sVertex, 0);
9.     g.setParent(sVertex, sVertex);
10.     Iterator edgeIter;
11.     minPathPQ.push(new MinInfo(sVertex, 0));
12.     int numVisited = 0, numVertices = verticesColor.size();
13.     while(!minPathPQ.isEmpty() && numVisited
14.         MinInfo min = minPathPQ.pop();
15.         edgeIter = g.getNeighbors(min.endV).iterator();
16.         while(edgeIter.hasNext()) {
17.             T currVex = edgeIter.next();
18.             if(verticesColor.get(currVex).equals(VertexColor.WHITE)) {
19.                 int curWht = g.getData(currVex);
20.                 boolean flag = false;
21.                 if(curWht==PathBean.INFINITE) {
22.                     flag = true;
23.                 } else {
24.                     int newWht = g.getData(min.endV)+g.getWeight(min.endV, currVex);
25.                     if(newWht
26.                         flag = true;
27.                     }
28.                 }
29.                 if(flag) {
30.                     g.setData(currVex, g.getData(min.endV)+g.getWeight(min.endV, currVex));
31.                     g.setParent(currVex, min.endV);
32.                     minPathPQ.push(new MinInfo(currVex, g.getData(currVex)));
33.                 }
34.             }
35.         }
36.         if(!verticesColor.get(min.endV).equals(VertexColor.BLACK)) {
37.             verticesColor.put(min.endV, VertexColor.BLACK);
38.             numVisited++;
39.         }
40.     }
41. }

- Running-Time Analysis
Coloring the vertices WHITE and initializing the dataValue field is O(V) operation. The actions in the loop "while(!minPathPQ.empty())" dominate the running time. In the worst case, the algorithm pushes all edges into the priority queue and pops all edges. Each push() and pop() operation is O(n log2 n), so the running time for all these priority-queue operations is O(E log2 E). Then we know :

The running time for Dijkstra's algorithm is thus O(V + E log2 V).

Minimum Path in Acyclic Graphs :
When the weighted digraph is acyclic, the problem of finding minimum paths is greatly simplified. The depth-first search creates a list of vertices in topological order :
dfsList:[v0, v1, ..., vn-1]

Assume vi is the starting vertex for the minimum-path problem. Vertices in the list v0 to vi-1 are not reachable from vi because a path from vi would indicate that the vertex has an earlier finishing time and so would come after vi in dfsList.
The algorithm begins at vi. After initializing the data value for all of the vertices to ∞, set the data value for vi to 0 and its parent reference to vi. This establishes a minimum path from vi to vi with weight 0. Iteratively scan the tail of the list. For each vertex v in the index [i,n), its data value is the minimum-path weight for any path from vi to v. If the value is ∞, then no path exists, and you proceed to the next vertex in the list. Otherwise, look at each neighbor w of v. We have found a path from vi to w that goes through v. The weight of this path is P(vi, v) + weight(v,w). Compare this weight with the weight of a previously discovered path from vi to w. The notation data(v) is the value of the data field for vertex v, which corresponds to the weight of a path from vi to v. Perform the comparison :
weight(P(vi, v) + (v, w)) = data(v) + weight(v,w) < data(w)

If the new path to w through v is better, update the data and parent reference fields for w. When the sequential scan concludes, the data field for each vertex has the minimum-path weight. For all reachable vertices, the parent reference field can be used to reconstruct the actual minimum path.
The method dagMinimumPath() implements the minimum-path algorithm for an acyclic graph. The method has parameters for the graph and the starting vertex :

- Method dagMinimumPath() :
1. /**
2. * BD :
3. *   In the directed acyclic graph, find the path with minimum total weight from sVertex to each
4. *   vertex reachable from sVertex; upon return, the dataValue field of each vertex in g is either
5. *   the minimum path weight to the vertex or is INFINITE if the vertex was reachable from sVertex;
6. *   call path(g, sVertex, v) to find the minimum path from sVertex to v.
7. */
8. public static  void dagMinimumPath(JGraph g, T sVertex) {
10.     topologicalSort(g, tList);
11.     g.initData();
12.     g.setData(sVertex, 0);
13.     g.setParent(sVertex, sVertex);
14.     int dataValue, w;
15.     T currVex, neighborVex;
16.     Iterator topSortIter = tList.iterator(), setIter;
17.     while(topSortIter.hasNext()) {
18.         currVex = topSortIter.next();
19.         if((dataValue=g.getData(currVex))!=PathBean.INFINITE) {
20.             setIter = g.getNeighbors(currVex).iterator();
21.             while(setIter.hasNext()){
22.                 neighborVex = setIter.next();
23.                 w = dataValue + g.getWeight(currVex, neighborVex);
24.                 int nw = g.getData(neighborVex);
25.                 if(nw!=PathBean.INFINITE){
26.                     if(w < nw) {
27.                         g.setData(neighborVex, w);
28.                         g.setParent(neighborVex, currVex);
29.                     }
30.                 } else {
31.                     g.setData(neighborVex, w);
32.                     g.setParent(neighborVex, currVex);
33.                 }
34.             }
35.         }
36.     }
37. }

A simple intuitive argument indicates why the algorithm works. At a vertex v in the sequential scan of dfsList, we use its current data value as the minimum-path weight from vi to v. This value, data(v), is the weight of the minimum path from vi to v. For there to be a better path, there must be an unvisited vertex, u, reachable from vi, that has an edge to v :

This is not possible, because topological order guarantees that u will come earlier in dfsList.
- Running-Time Analysis
The algorithm first creates a topological sort of the vertices with running time O(V+E). A loop visits all the vertices in the graph once and examines edges that emanate from each vertex only once. Access to all of the vertices and edges has running time O(V+E) and so the total running time for the algorithm is O(V+E).

Supplement :
* [ 資料結構 小學堂 ] 圖形結構 : 圖形最短路徑 (單點對全部頂點)

### [ FP In Python ] Ch4. Higher-Order Functions

Preface   In the  last chapter  we saw an iterator algebra that builds on the  itertools  module. In some ways,  higher-order functions  (...