## 2011年5月2日 星期一

### [ Data Structures with Java ] Section 24.3 : Graph-Traversal Algorithms

Preface :
In Chapter 16, we developed a series of scanning algorithms for binary trees. The algorithms described an iterative level-order scan and recursive preorder, inorder and postorder scans of the tree. By beginning at the root, the algorithms visit each node of the tree exactly once. Traversing a graph is more involved. Graphs do not have a vertex, like a root, that initiates unique paths to each of the vertices. From any starting vertex in a graph, it might not be possible to search all of the vertices. In addition, a graph could have a cycle that results in multiple visits to a vertex. To avoid this from happening, the search algorithms need a strategy to mark a vertex once it has been visited. As you will see, we color the vertices WHITE, GRAY or BLACK to implement the strategy.
Traditionally, graph-traversal algorithms are termed search algorithms. In some cases, the scan has the traditional purpose of looking for a target value. In other cases, the search determines the relationships between vertices and their edges. In this book, we will use the terms graph-search algorithms andgraph-traversal algorithms interchangeably. The algorithm reduce to two standard methods, the breadth-first search and depth-first search. The breadth-first search visits vertices in the order of their path length from a starting vertex. The search involves only the vertices that are path connected (reachable) from the starting vertex. In some cases, this list is a subset of the full set of vertex. The depth-first search traverses all the vertices of a graph by making a series of recursive calls that follow paths through the graph. The fact that the latter search visits all of the vertices is notable. It does this by executing a series of depth-first visits, which carry out the recursive process assuming a starting vertex. The depth-first visit may include only a subset of the graph vertices. The depth-first search repeats depth-first visits with starting vertices chosen from the list of unvisited vertices. A chair of depth-first eventually includes all of the graph vertices. All of the these details will make sense when we look at graph applications.

The breadth-first search is modeled after the level-order scan in a binary tree with a notable difference. The tree scan always begins at the root. A graph scan is relative to a specified starting vertex and visits occur only with vertices that are path connected (reachable) from the starting vertex. Let us illustrate the process with the graph in below figure. We assume the search originates with vertex A. The first vertex visited is A, which has a path length 0 from itself. We then visit each of its neighbors and so the ordering of the visits may vary. Assume the order of the visit for the first four vertices is :
A, B, C, G // A has path length 0. From A, B,C and G have path length 1

From vertex B, we visit its neighbor D with path length 2. The search concludes by visiting vertices E and F, which are the neighbors of D with path length 3 from A. In all cases, the breadth-first search visits vertices in the order of their path length from A. Since the search does not specify an order of visit to vertices at the same path length, there is no unique breadth-first search protocol. Given our assumptions for this example, the breadth-first visits the vertices in the order :
A, B, C, G, D, E, F

The breadth-first search of the graph starting at vertex A allows us to visit all of the vertices in the graph. If we had started at vertex D, the breadth-first search would have included only vertices D, E, F and G. A search from vertex E would visit only the one vertex because E doesn't have any adjacent vertices.
As in the level-order scan of a binary tree, we design the breadth-first search as a process that uses a queue to store temporarily the vertices awaiting a visit. At each iterative step, the algorithm pops a vertex from the queue, marks it as visited, and then insert it into a list of visited vertices. The step concludes by placing all unvisited neighbors of the vertex in the queue. In order to maintain information on the "visit" status of each vertex, we associate a color (WHITE, GRAY, BLACK) with each vertex in the graph. Initially, all vertices have color WHITE (unvisited). When a vertex enters the queue, the color is set to GRAY to indicating that it is discovered. Upon removal from the queue, the color is BLACK, indicating that the vertex has been visited. Using the color attributes that we don not visit a vertex more than once during the traversal.
The following are the iterative steps for the graph in upper figure, assuming A is the starting vertex. The initial action pushes A into visitQueue, which is the queue that temporarily stores the vertices :
Step1 :

Pop A from the queue, color it BLACK and insert it into visitList, which is the list of visited vertices. The neighbors of A are vertices B,C and G, which are still colored WHITE. Push the vertices into the queue and color them GRAY. Figure(a) displays the elements in visitList and visitQueue after the completion of the first iteration of the search.

Step2 :

Pop B from the queue and place it in visitList with color BLACK. The only adjacent vertex for B is D. Color D GRAY and add it to the queue. Figure(b)

Step3 :

Pop C and place it in visitList as Figure(c). C has an adjacent vertex G, but the color of this neighbor is nonwhite, indicating it is either visited or in the queue awaiting a visit. In fact, G is in the queue with color GRAY. No new vertices enter the queue, and we are ready for the next step.

Step4-5 :

Continue the process by popping vertex G from the queue and placing it in visitList (Figure d). G has no adjacent vertices, so continue to Step5, which pops D from the queue. The neighbors, E and F, enter the queue (Figure e) :

Step6-7 :

The previous steps identify the key features of the algorithm. In Step6, pop E from the queue and add it to visitList (Figure f). E has no neighbors, so proceed to Step7, which removes F from the queue. The adjacent vertices E and G are already visited (colored BLACK), so the algorithm concludes since the queue is empty (Figure g) :

To implement the breadth-first search, we meed a mechanism for assigning colors to vertices. Vertex colors for the breadth-first search and other graph algorithms are found in the Enum VertexColor. This enum defines the color of a vertex inside the class JGraph. For more detail and concrete implementation, please refer to example 24.2 which demonstrate the whole source code in the static method bfs() which implement the breadth-first search algorithm.

- Running Time for Breadth-First Search
In the breadth-first search algorithm, the method bfs() will visit each vertex once and assigns the color WHITE to it. This is an O(V) operation, while V denotes the number of vertices. Each vertex enters the queue once, at the most, and hence is popped from the queue at most once. Each queue operation has efficiency O(1), so the total running time for queue handling is O(V). When a vertex enters the queue, the algorithm searches its adjacency list. The total number of elements in all of the adjacent lists is E, the number of edges in the graph; the running time to search the lists is at most O(E). The combination of the activities that initialize the graph, process the vertices in the queue, and search the edges determines the running time for the breadth-first search, which is O(V+E).

Depth-First Visit Algorithm :
The concept of depth-first search algorithm begins with the notion of a depth-first visit from a starting vertex. The depth-first visit algorithm is modeled after the recursive postorder scan of a binary tree. In the tree, a node is visited only after visits are made to all of the nodes in its subtree. In the graph, a vertex is visited only after all of the vertices in paths that emanate from the vertex.
As with the breadth-first search, we uses colors to indicate the status of vertices. Initially, all vertices are WHITE. A vertex is colored GRAY when it is first contacted in a recursive descent. Only when the vertex is actually visited does it become BLACK. A depth-first visit begins at a starting vertex and searches down the path of neighbors until it reaches a vertex that has no neighbors or only neighbors that have already been visited. At this point, a visit occurs at the "terminal" vertex. We then backtrack to the previous recursive step and look for another adjacent vertex to launch a scan down its paths. There is no ordering among vertices in an adjacency list, so the paths and hence the order of visits to vertices can vary.
An example illustrates the process. Consider the graph in below figure where a depth-first visit starts at vertex A. We follow the process by noting each vertex when it is first discovered in a recursive descent, when it is recontacted through backtracking, and when it is actually visited. In the figure, the notation d/f describes two integer values d and f that denote the order in which a vertex is discovered (colored GRAY) and visited (colored BLACK) during the search. Graph applications sometimes refer to these values as the "discover time" and "finishing time" for a vertex. The value for d and f will become clear as we trace the depth-first visit :

Discover A:

The recursive process first discovers A and colors it GRAY. Continue the scan with one of the neighbors of A, namely B or C. Assume B is chosen.

Discover B:

Vertex B is discovered and colored GRAY. Move to D, the only neighbor of B.

Discover D:

Like vertex B, color it GRAY. The two neighbors of D are B and E. However, B is GRAY and so it has been discovered even through it is not visited. We know B is accounted for somewhere in a recursive descent from the starting Vertex A. Select E, which is still a WHITE vertex.

Discover E:

Initially, the vertex is colored GRAY. However, it does not have any neighbors, and so we reached a stopping condition. Visit E by coloring it BLACK and then backtrack to the previous vertex.

Backtrack D:

We are again at a stopping condition. The two neighbors of D are either discovered (vertex B) or visited (vertex E). So we visit D and color it BLACK.

Backtrack B:

Like vertex D, we are at a stopping condition and so we visit B by coloring it BLACK.

Backtrack A:

Choose vertex C, which is the other (undiscovered) neighbor of A.

Discover C:

Once we are at C, we note that its only neighbor E has already been visited and so C becomes the next vertex visited (BLACK).

Backtrack A:

This vertex has been involved in the recursive process three times. It was the first vertex to be discovered and the last vertex to be visited. Color it BLACK. The recursive process concludes when the starting vertex is visited.

A depth-first visit returns a list of vertices in the reverse order of their visit or finishing times. In our example, the list is [A,C,B,D,E]. The vertex at the front of the list is the last vertex visited. This is, of course, the starting vertex. The vertex at the back of the list is the first vertex visited. The list is the collection of vertices that are reachable from the starting vertex. We need to appreciate the significance of the ordering of vertices in the list. A vertex in the list is visited only after all of the vertices in the tail of the list are visited. For instance, a visit to C occurs only after visits to vertices reachable from C, and these vertices are found in the tail of the list (vertices B, D and E) and not vertex A. The list indicates that a visit to B results from searching activity involving only vertices D and E, which from the tail of list.

- Discovering a Cycle
Graph algorithms often make use of a depth-first visit. For instance, we can use it to discover the presence of a cycle within the set of reachable vertices. Recall that a cycle is a directed path of length 2 or more that connects a vertex to itself :
P(v,v): v=v1, v2, ..., v(m-1), vm=v // while m>1

We use the coloring of vertices in the recursive depth-first first and look for an edge that connects a vertex to a neighbor that has color GRAY. The edge, called a back edge, links a vertex back to a neighbor that has already been discovered in a previous recursive step. A back edge indicates that the presence of a cycle. More explicitly, a depth-first visit has a cycle if and only if it has a back edge.
To understand why a back edge in a depth-first visit indicates the presence of a cycle, assume the recursive scan identifies the back edge (w,v). Since v is GRAY, we know that at some point in the scan, we discovered v and then proceeded down a path from v that includes w. The length of the path depends on the number of recursive calls from the point when v was first discovered to the current vertex w, which identifies v as a neighbor. If we add edge (w,v) to the path from v to w, the length of the new path is at least two and is a cycle (In below figure) :

Conversely, if a depth-first has a cycle within its path, it will find a back edge. To see this, assume that v is the first vertex in the cycle and that the cycle path is v, ..., w, v. The depth-first visit first discovers v and colors it GRAY. The scan will continue around the cycle and eventually discover w. At this time, the vertex v is GRAY and so (w,v) is a back edge.

- Implementing the Depth-First Visit
The static visit method dfsVisit() in the class JGraph implements the depth-first visit algorithm. The method includes a graph and a WHITE starting vertexsVertex as its first two parameters. A third parameter is the LinkedList dfsList that stores the list of visited vertices. A fourth parameter is the HashMap which records the color of each vertices to visit. The last parameter is the boolean variable checkForCycle. We use this argument only for applications that require an acyclic graph. The method dfsVisit() routinely checks for a cycle. When the method detects one, it checks the boolean flag and if true, throws a runtime exception. A program that uses dfsVisit() can test for a cycle by setting this argument to true and including the method call within a try/catch block. The method descends through the graph, discovering and processing all of the WHITE vertices if finds. For demonstration and implementation on depth-first search, please check Example 24.3.

Acyclic Graphs :
A graph is acyclic if it contains no cycle. This is a global property of the graph. The method dfsVisit() can identify a cycle, but only within the set of vertices that are reachable from a starting vertex. To check for a cycle anywhere in the graph, we traverse all of the vertices by using multiple calls to dfsVisit(). This is essentially the dfs() algorithm, where the focus is on identifying cycles and not no obtaining a list of visited vertices. This approach relies on the fact that any cycle must be included within one of the calls to dfsVisit(). The method, with checkForCycle set to true, identifies the cycle.
To verify this fact, assume that a graph has a cycle that begins and ends at vertex v. The condition implies the existence of a path P(v,v) of length at least 2 :
P(v,v): v=v1, v2, ..., v(m-1), vm=v // while m>1

where vertex v(i+1) is a neighbor of vi. Let vs be the starting vertex for a call to dfsVisit() that reaches v. The depth-first traversal ensures that there is a path P(vs, v) that connects vs to v. By combining the paths P(vs, v) and P(v, v), we note that all of the vertices in the cycle are reachable from vs and thus are visited by dfsVisit() when vs is the starting vertex.
The static method acyclic() in the class JGraph takes a graph as an argument and returns a boolean value indicating whether a cycle is present. The implementation uses the strategy for dfs(). A look searches the vertices of the graph and makes a call to dfsVisit() for each unvisited (WHITE) vertex with the argument checkForCycle set to true. By placing the dfsVisit() call within a try block, acyclic() can catch the exception thrown by dfsVisit() if it finds a cycle and returns false. Otherwise, acyclic() returns true.

Supplement :
* [ 資料結構 小學堂 ] 圖形結構 : 圖形的追蹤

## 關於我自己

Where there is a will, there is a way!