程式扎記: [ Data Structures with Java ] Section 7.3 : Quicksort

標籤

2011年1月23日 星期日

[ Data Structures with Java ] Section 7.3 : Quicksort

前言 : 
The famous quicksort algorithm uses another form of the divide-and-conquer strategy to order an array. The algorithm, discovered by C.A.R Hoare, uses a series of recursive calls to partition a list into smaller and smaller sublists about a value called the pivot. Each step chooses as the pivot the value at the midpoint of the list. The partitioning algorithm performs exchanges so that the pivot value is placed in its correct final position in the list after the elements are ultimately sorted. Meanwhile, the lower sublist has only elements that are less than or equal to the pivot, and the upper sublist has only elements that are greater than or equal to the pivot. 
Mergesort and quicksort use very different approaches to ordering an array. In mergesort, a recursive descent partitions that array into progressively smaller and smaller half-sublists. Merging the sublists during the recursive ascent phase phase of the algorithm creates the order. The original array is ordered only after the final merge, which corresponds to the first recursive step. Quicksort, on the other hand, orders the pivot value in the list and then recursively moves to the resulting lower and upper sublists. The original array is ordered after completing the recursive descent. Note also that the mergesort orders elements by copying them to and from a temporary array. Quicksort is an "in-place" sorting algorithm, because it reorders the sequence by exchanging elements within the list. 

Partitioning a List with a Pivot : 
The best way to understand the overall design of the quicksort algorithm is by working through an example. You understand the key features of the algorithm by viewing in detail a single recursive step. Let arr be an array containing 10 integer values : 
arr: {800, 150, 300, 650, 550, 500, 400, 350, 450, 900} 

For the first recursive step, the list includes all of the elements in array arr. The index range [first, last) is [0, 10). The partitioning process uses the pivot value at the mid-point of the range : 

// pivot = arr[mid] where mid = (last + first)/2
mid = (0+10)/2 = 5
pivot = arr[5] = 500;

The algorithm separates the elements of arr into two sublists, S1 and S2. Sublist S1 is the lower sublist and contains the elements that are less than or equal to the pivot. The higher sublist, S2, contains the elements that are greater than or equal to the pivot. The pivot 500 is the value that will ultimately lie between the two sublists. We begin by exchanging the pivot with the first element 800. The exchange of arr[first] and arr[mid] has the effect of moving the pivot to the front of the list and setting up a scan of the list with index range [first+1, last). The scan uses two indices, scanUp and scanDown. The indexscanUp starts at position first+1 and moves up the list, identifying the element in sublist S1. The index scanDown starts at position last-1 and moves down the list, identifying elements in sublist S2. 
 
 

QuickSort Recursive Descent : 
The quicksort partition process indicates the index for the pivot value and the sublist S1 and S2. Let the variable pivotLoc denote the index of pivot. Recursive steps continue the partitioning process on S1 with index range [first, pivotLoc) and S2 with index range [pivotLoc+1, last) respectively. Let us continue the example for those two sublists. Below figure displays the full set of recursive steps, which partition the list and locate the pivot value. The position of pivot (pivotLoc) is the shaded element : 
 
After completing the recursive descent of the progressively smaller sublists, all elements were a pivot at some point or in a 1-element sublist and thus placed in their proper location. The resulting array is ordered. 

The pivotIndex Method : 
The recursive step in the quicksort algorithm reorders elements in the index range [first, last) about a pivot value. We provide the method pivotIndex() to perform this task. The method takes an array arr of generic type and two indices first and last specifying the index range. The method selects the value of the midpoint of the array as the pivot. The return value is the final location (index) of the pivot value after partitioning the list into two sublists consisting of elements that are less than or equal to the pivot and elements that are greater than or equal to the pivot : 

- 函數 pivotIndex 範例代碼 :
  1. public static extends Comparablesuper T>> void swap(T[] arr, int p1, int p2){  
  2.     T temp = arr[p1];  
  3.     arr[p1] = arr[p2];  
  4.     arr[p2] = temp;  
  5. }  
  6.   
  7. public static extends Comparablesuper T>> int pivotIndex(T[] arr, int first, int last){  
  8.     int size = last - first;  
  9.     if(size==0) {        // empty sublist  
  10.         return last;       
  11.     } else if(size==1) { // 1-element sublist  
  12.         return first;  
  13.     } else if(size==2){  // 2-elements sublist    
  14.         if(arr[first].compareTo(arr[first+1])>0){  
  15.             swap(arr, first, first+1);  
  16.         }  
  17.     } else if(size >0){  
  18.         int mid = (first+last)/2;  
  19.         swap(arr, first, mid); // exchange the pivot and first element.  
  20.         int scanUp=first+1, scanDown=last-1;  
  21.         while(scanUp <= scanDown && scanDown>first && scanUp
  22.             if(arr[scanUp].compareTo(arr[first])<=0) {  
  23.                 scanUp++;  
  24.                 continue;  
  25.             }  
  26.             if(arr[scanDown].compareTo(arr[first])>=0){  
  27.                 scanDown--;  
  28.                 continue;  
  29.             }  
  30.             swap(arr, scanUp, scanDown);  
  31.             scanUp++;  
  32.             scanDown--;  
  33.         }  
  34.         swap(arr, first, scanDown);           
  35.         return scanDown;  
  36.     }  
  37.     return (first+last)/2;  
  38. }  

The quicksort() method : 
The method quicksort() provides a user with simple access to the quicksort algorithm. It takes a single array arr of generic type as its argument and executes the sort by calling the private method qsort() with the array and indices 0 and arr.length as arguements : 

  1. public static extends Comparablesuper T>> void quickSort(T[] arr) {  
  2.     qsort(arr, 0, arr.length);  
  3. }  
The method qsort() implements the recursive quicksort algorithm. The method takes an array arr of generic type and two integers first and last, which specify an index range. The design for qsort() recursively partitions the elements in the index range into smaller and smaller sublists. The process terminates when the size of a list is 0 or 1. These are stopping conditions because such a list is obviously ordered. For efficiency, we handle a list of size 2 by simply comparing the elements and making an exchange if necessary : 

- 函數 qsort() 範例代碼 :
  1. public static extends Comparablesuper T>> void qsort(T[] arr, int first, int last) {  
  2.     if((last-first)<=1)  
  3.         return;  
  4.     else if((last-first)==2) {  
  5.         if(arr[first].compareTo(arr[first+1])>0)  
  6.             swap(arr, first, first+1);  
  7.         return;  
  8.     } else {  
  9.         int pivotIndex = pivotIndex(arr, first, last);  
  10.         if(pivotIndex>first)qsort(arr, first, pivotIndex);  
  11.         if((pivotIndex+1)1, last);  
  12.     }             
  13. }  

Running Time of Quicksort : 
To measure the efficiency of the quicksort() algorithm, assume that n is a power of 2 : n=2^k (k=log2(n)). In addition, assume that the pivot lies in the middle of each list, so the quicksort partitions the list into two equal-sized sublists. Under these rather ideal circumstances, we can get a handle on the number of comparisons. 
For partition level 0, these are approximately n comparisons (n-1, to be exact). The result of the process creates two sublists of approximate size n/2. For partition level 1, there are two sublists. Partitioning each sublist requires approximately n/2 comparisons. The total comparison for the two sublists is 2*(n/2) = n. Eventually, the partition process terminates after k levels with sublist having size 1(This level doesn't require comparison!). The total number of comparisons is approximately : 
 

The ideal case we have discussed is actually realized when the array is already sorted in ascending order. In this case, the pivot is precisely in the middle of the list. For an arbitrary array, a mathematical calculation verifies that the average running time, T(n) for quicksort satisfes the relation : 
 

This shows that the average running time for quicksort() is O(n log2(n)). 
The worst-case scenario for quicksort() occurs when pivot consistently splits off a one-element sublist and leaves the rest of the elements in the second sublist. This occurs, for example, when the pivot is always the smallest or the largest element in its sublist. For example, the data 5, 3, 1, 2, 9 exhibit this behavior. In partition level 0, there are n-1 comparisons, and the large sublist contains n-1 elements. In partition level1, there are n-2 comparisons, and the large sublist contains n-3 elements, and so forth. The total number of comparisons is : 
 

And the complexity is O(n^2), which is no better than the selection or insertion sort. However, the worst case is unlikely to occur in practice. In general, the overall performance of quicksort is superior to all the other general-purpose sorts we have discussed. 

沒有留言:

張貼留言

網誌存檔