程式扎記: [ Data Structures with Java ] Section 22.4 : Sorting with a Heap

標籤

2011年4月18日 星期一

[ Data Structures with Java ] Section 22.4 : Sorting with a Heap

Preface : 
Recall the selection sort algorithm that orders elements in array arr of size n in ascending order. A traditional approach describes the algorithm as a series of iterative steps that place the correct element in index 0, then index 1 and so forth. A modification of the algorithm works in the opposite direction. An iteration at index n-1 searches the sublist from arr[0] to arr[n-1] looking for the largest element and exchanges it with arr[n-1]. The next iteration, at index n-2 searches for the largest element in the sublist from arr[0] to arr[n-2] and exchanges it with arr[n-2]. The process concludes at index 0. The selection sort is a brute-force O(n^2) algorithm because each iteration involves an O(n) scan to locate the largest element. 
If the original array a heap, a simpler and more efficient from the selection sort can be used. For each iteration i, the largest element in the sublist from arr[0] to arr is arr[0]. Simply exchange this largest element with arr[i] and then reorder the array so that elements in the index range [0,i) are a heap. You should recognize that the exchange and reordering is precisely the popHeap() algorithm. The efficiency that is gained when the array is a heap is obvious. At each iteration, the largest (maximum) element is at a specified location and no search is required. The only overhead involves reestablishing the sublist as a heap. In this section, we develop this "sorting with a heap" algorithm, which is appropriately called the [i]heapsort. 

Making a Heap : 
Some applications, like the heapsort, start with an array and need to reorder the elements so that the corresponding array-based tree is a heap. The process is called "heapifying" the array. We use the reordering algorithm to implement the static method makeHeap() in the Heaps class. The parameter list includes the array and a Comparator : 
public static void makeHeap(T[] arr, Comparator comp); 

The algorithm is greatly simplified by the fact that all leaf nodes satisfy heap order (because they have no children). So makeHeap() looks only at non-leaf nodes. Heapifying the array begins with the last interior node, which is in fact the parent of the last element in the array. Because the last node has index n-1, where n is the length of the array, its parent is at index : 
currPos = ((n-1) - 1) / 2 = (n-2) / 2 

Processing the interior nodes begins with the last interior node and then continues sequentially back to the root node at index 0. Each interior node is handled the same way. We first check to see if the value of the node is greater than or equal to its children. If the condition is true, the node satisfies heap order and we proceed to the next smaller index. If the condition is not true, however, the node must be positioned somewhere in the path of largest children. The reordering operation is precisely the task performed by adjustHeap(). 
As an example, consider the following Integer array shown in the figure. The shaded nodes are leaf nodes with indices 5,6,...,9. The last interior node appears at index : 
currPos = (10-2)/2 = 4 

Method makeHeap() will apply adjustHeap() to successive nodes at indices 4,3,...0. : 
Integer[] arr = {9, 12, 17, 30, 50, 20, 60, 65, 4, 19} 

We illustrate the process of "heapifying" the array into a maximum heap. Each picture highlights the subtree that is accessed by the method adjustHeap(). 
 
- Locate element at index 4 : 

The value arr[4] = 50 is greater than its child arr[9]=19, and the heap condition for the subtree is satisfied. No exchanges occur. (Figure a)

- Locate element at index 3 : 

The value arr[3]=30 is less than its child arr[7]=65 and must exchange with its child (Figure b)

- Locate element at index 2 : 

The value arr[2]=17 is less than its child arr[6]=60 and must exchange with child (Figure c)

- Locate element at index 1 : 

The arr[1]=12 is less than its child arr[3]=65 and must exchange with the child. The value 12 must subsequently exchange 30 so that the heap condition is satisfied for the subtree(Figure d)

- Locate element at index 0 : 

The process terminates at the root node. The value arr[0]=9 must exchange positions with its child arr[1]=65 and then continue down two more levels. The resulting tree is a maximum heap (Figure e).

The implementation of makeHeap() makes an array and a Comparator object as arguments. The comparator indicates whether the heapification creates a maximum or a minimum heap. Use a comparator Greater to create a maximum heap and Less to create a minimum heap. The code makes repeated calls to adjustHeap(), starting at the index of the first non-leaf node. Decrement the index down to 0(root). 

- Method makeHeap() in class Heaps :
  1. public static  void makeHeap(T[] arr, Comparatorsuper T> comp) {  
  2.     int lastPos=arr.length, heapPos = (lastPos-2)/2;  
  3.     while(heapPos>=0) {  
  4.         adjustHeap(arr, heapPos, lastPos, comp);  
  5.         heapPos--;  
  6.     }  
  7. }  

The Heapsort : 
The heapsort is a modified version of the selection sort for an array arr that is a heap. Iterations handle indices in order from n-1 down to 1. For each index i, the iteration exchanges the element of highest priority, which is arr[0], with arr[i] and then reheapifies the array in the index [0,i). The task at each iteration is precisely the algorithm popHeap(), which pops the highest priority from the heap and assigns it at the rear of the array. With a maximum heap, the first call to popHeap(), with last = n, copies the largest element in the list to arr[n-1]. The next call copies the second largest element to arr[n-2], and so forth. The array is sorted in ascending order. A minimum heap sorts the array in descending order. 
The following is an implementation of the heapsort algorithm. The first instruction calls makeHeap() to build the array as a heap. You can find the method heapSort() in the Arrays class : 

- Method heapSort() in class Arrays :
  1. public static void heapSort(T[] arr, Comparatorsuper T> comp) {  
  2.     Heaps.makeHeap(arr, comp);  
  3.     for(int i=arr.length; i>1; i--) {  
  4.         Heaps.popHeap(arr, i, comp);  
  5.     }  
  6. }  

- Example 22.3 
This example illustrates the heapsort for an array with integer values in the range 0-9. By calling heapSort() with a Greater comparator object, the array is sorted in ascending order. A second call, with the comparator Less, sorts the list in descending order. After each sort, the list is out. 

- Example code :
  1. public static void main(){  
  2.     Integer[] arr = {7190824365};  
  3.     Arrays.heapSort(arr, new Greater());  
  4.     System.out.println("Sort (ascending): "+Arrays.toString(arr));  
  5.     Arrays.heapSort(arr, new Less());  
  6.     System.out.println("Sort (decending): "+Arrays.toString(arr));  
  7. }  

Output : 

Sort (ascending): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Sort (decending): [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

Computational Efficiency of makeHeap() and Heapsort : 
It would appear that the running time of makeHeap() is O(n log2 n) because it performs n/2 filter-down operations, each with worst-case running time O(log2n). However, a mathematical analysis actually shows that the worst-case running time of makeHeap() is O(n). 
An n-element array corresponds to a complete binary tree of depth k = int(log2n). During the second phase of the heapsort, popHeap() executes n-1 times. Each operation has efficiency O(log2n). Because makeHeap() is an O(n) operation, the worst-case complexity of the heapsort is : 
O(n) + O(n log2 n) = O(n log2 n). 

The heapsort does not require any additional storage, so it is an in-place sort. Some O(n log2 n) sorts have a worst-case behavior O(n^2). The quicksort that we discussed in Chapter 7 is an example. In contrast, the heapsort has worst-case complexity O(n log2 n), regardless of the initial distribution of the data!

沒有留言:

張貼留言

網誌存檔

關於我自己

我的相片
Where there is a will, there is a way!