## 2011年5月15日 星期日

### [ Data Structures with Java ] Section 25.6 : Minimum Spanning Tree

Preface :
The minimum-path algorithm finds an optimal path connecting a starting vertex with each vertex in a directed graph. A more general problem deals with a connected undirected graph. We want to find an acyclic set of edges that connect all of the vertices in the graph with the smallest total weight. Let E be the set of edges in the graph and T be the acyclic subset of E. Because T is acyclic and connects (spans) all the vertices, it forms a tree called the minimum spanning tree. The concept has important applications. A network connects hubs in a system. The minimum spanning tree links all of the nodes in the system with the least amount of cable. There are a variety of minimum-spanning-tree algorithms. One is Prim's algorithm, which builds the tree vertex by vertex. At each stage, the algorithm adds a new vertex and an edge that connects the new vertex with the ones already in the tree.

Prim's Algorithm :
Prim's algorithm creates a minimum spanning tree for a weighted undirected graph that is connected. The mechanics are very similar to the Dijkstra minimum-path algorithm. The iterative process begins with any starting vertex and maintains two variables minSpanTreeSize and minSpanTreeWeight, which have initial value 0. Each step adds a new vertex to the spanning tree. The process terminates when all of the vertices added to the tree. Adding a vertex also involves adding the edge of minimal weight that connects the vertex to those already in the minimal spanning tree. The weight of the edge updates the variable minSpanTreeWeight.
Let us look at the algorithm for the graph in below figure with A selected as the first vertex in the spanning tree :

We implement Prim's algorithm by using a priority queue of MinInfo objects, much as we do in the Dijkstra minimum-path algorithm. An iterative step inserts an element into the priority queue when there is an edge e(v,w), v is a vertex already in the minimum spanning tree, w is a vertex not in the tree, and adding the edge provides a smaller weight that the weight from any previously discovered edge what will connect a vertex to the spanning tree. The endVfield of a MinInfo object is w, and the pathWeight field is the weight of the edge. We also use the vertex color, data and parent reference properties :
color :

Initially, all vertex are colored WHITE, to indicate that they are not in the spanning tree. When a vertex enters the tree, the color is set to BLACK. The colors BLACK and WHITE distinguish whether vertices are in the spanning tree.

data :

This is the minimum weight of an edge that would connect the vertex to an existing vertex that is already in the spanning tree. As the tree grows, the data value is updated, because there are more and more available edges to connect the vertex to the tree. Initially, the data value is ∞

parent :

This is the source vertex for the minimum edge associated with the data value. The parent reference is a vertex in the spanning tree. Each update to the data value has a corresponding update for the parent reference field.

The following details the setup for the algorithm and the iterative steps. We include a display of MinInfo objects in the priority queue and the status of thecolor and parent fields for each vertex.
Setup :

Create the object MI(A,0) and push it into the priority queue. The object creates an implied edge e = (A,A) with weight 0. At the same time, set the data value to 0 and the parent reference to A.

Step1 :

Pop MI(A,0) from the priority queue and color A BLACK. This has the effect of placing A in the spanning tree. With each deletion, we increment the variable minSpanTreeSize and add the data value to minSpanTreeWeight. The resulting variables have values 1 and 0. We search the adjacent list for A to locate the edges that could connect an adjacent vertex not in the spanning tree to the single vertex already in the tree and improve on the existing weight for connection to the spanning tree. The vertices B, C and D are neighbors of A with edges having weights 2, 12 and 5 respectively. For each neighbor v, create the MinInfo object with v as the ending vertex and the weight of the edge e=(A,v) as the path-weight. Also, update dataValue to the edge weight and set A as the parent.

Step2 :

Remove MinInfo(B,2) from the priority queue. Because vertex B is WHITE (not visited), color it BLACK. The importance of checking whether the vertex is already in the spanning tree will become clear in the comment following Step4. Increment minSpanTreeSize to 2 and update minSpanTreeWeight by adding the dataValue 2. The spanning tree now has two vertices, A and B. We have already taken care of vertex A, we need only look for nonvisited (WHITE) neighbors of B to find additional edges that would connect a vertex to the spanning tree. In the example, D is such a neighbor, with edge (B,D) having weight 8. This is a key point in the minimization algorithm.The dataValue of D is 5. This implies that an edge already exists which would connect D to the existing spanning tree. In fact, it is edge (A,D) from Step1. This is a better (lower-weight) connection than the new edge (B,D), and so we take no action.

Step3 :
Pop MinInfo(D,5) from the priority queue. D is WHITE; color it BLACK. The number of vertices in the spanning tree (minSpanTreeSize) is 3, andminSpanTreeWeight becomes 7 (5+2). The vertex has only one WHITE neighbor, C. The edge (D,C) has weight 7, which is less than the current "best" edge of 12 for C. The new edge is a better choice, and so we create MinInfo(C,7) and update C so its dataValue is 7 and its parent is D. Push the MiniInfo object onto the priority queue.

Step4 :

Pop MinInfo(C,7). The value for miniSpanTreeWeight becomes 17 (7+7). Because minSpanTreeSize now equals the vertex size of the graph, the process terminates. The weight for the minimum spanning tree is 14.

Note that, in Step3, vertex C appears twice in the priority queue. Initially, it enters the queue as a neighbor of A, in which edge (A,C) has weight 12. Once D is in the spanning tree, we look at its neighbors and find a better edge, (D,C), with weight 7. In this step, we pop MinInfo(C,7) and put C into the spanning tree (color it BLACK). In a larger example, a subsequent step may delete MinInfo(C,12) from the priority queue, but C would already be in the spanning tree (BLACK), so we would take no action, because we cannot connect C a second time with weight 12.

Implementing the minSpanTree() Method :
In the minSpanTree() method, we are interested in taking a graph as an argument and deriving both a minimum spanning tree and its total weight. We do this by using a signature with two graph references as arguments and in integer return value. The first argument is the original graph and the second argument is the minimum spanning tree that is created by the method. The return value is the total weight. The following is the implementation of method minSpanTree() :

- Method minSpanTree() in class JGraph :
1. /**
2. * BD :
3. *   Find the minimum spanning tree for the connected graph g; g is represented as a digraph
4. *   with bidirectional edges of equal weight.
5. */
6. public static  int minSpanTree(JGraph g, JGraph MST, T sVertex) {
7.     if(!g.containsVertex(sVertex)) throw new java.lang.IllegalArgumentException("minSpanTree(): Vertices "+sVertex+" doesn't exist");
8.     HeapPQueue> minTreePQ = new HeapPQueue>(new Less());
9.     MinInfo vertexData = null;
10.     HashMap verticesColor = new HashMap();
11.     Iterator iter = g.vertexSet().iterator(), neighborIter;
12.     while(iter.hasNext()) {
13.         T vex = iter.next();
15.         verticesColor.put(vex, VertexColor.WHITE);
16.     }
17.     int minSpanTreeSize = 0;
18.     int minSpanTreeWeight = 0;
19.     g.initData();
20.     g.setData(sVertex, 0);
21.     g.setParent(sVertex, sVertex);
22.     minTreePQ.push(new MinInfo(sVertex, 0));
23.     while(minSpanTreeSize
24.         vertexData = minTreePQ.pop();
25.         neighborIter = g.getNeighbors(vertexData.endV).iterator();
26.         while(neighborIter.hasNext()) {
27.             T neighborVex = neighborIter.next();
28.             if(verticesColor.get(neighborVex).equals(VertexColor.WHITE)) {
29.                 int ow = g.getData(neighborVex);
30.                 if(ow==PathBean.INFINITE) {
31.                     g.setParent(neighborVex, vertexData.endV);
32.                     g.setData(neighborVex, g.getWeight(vertexData.endV, neighborVex));
33.                 } else {
34.                     int nw = g.getWeight(vertexData.endV, neighborVex);
35.                     if(nw
36.                         g.setParent(neighborVex, vertexData.endV);
37.                         g.setData(neighborVex, nw);
38.                     }
39.                 }
40.                 minTreePQ.push(new MinInfo(neighborVex, g.getData(neighborVex)));
41.             }
42.         }
43.         if(verticesColor.get(vertexData.endV).equals(VertexColor.WHITE)) {
44.             minSpanTreeSize++;
45.             minSpanTreeWeight+=g.getData(vertexData.endV);
46.             verticesColor.put(vertexData.endV, VertexColor.BLACK);
47.         }
48.     }
49.     // Build up the spanning tree graph
50.     iter = g.vertexSet().iterator();
51.     int edgeWeight;
52.     while(iter.hasNext()) {
53.         T vex = iter.next();
54.         if(!vex.equals(sVertex)) {
55.             T parent = g.getParent(vex);
56.             edgeWeight = g.getWeight(vex, parent);
57.             MST.addEdge(vex, parent, edgeWeight);
58.             MST.addEdge(parent, vex, edgeWeight);
59.         }
60.     }
61.     return minSpanTreeWeight;
62. }

- Running Time Analysis
Prim's algorithm is just a variation of Dijkstra's algorithm, so its running time is O(V + E log2 V).

Supplement :
* [ 資料結構 小學堂 ] 圖形結構 : 擴張樹
* [ 資料結構 小學堂 ] 圖形結構 : 擴張樹 - 求最小成本擴張樹 (Kruskal 演算法)

## 關於我自己

Where there is a will, there is a way!