程式扎記: [ Data Structures with Java ] Section 15.2 : The Radix Sort

標籤

2011年10月26日 星期三

[ Data Structures with Java ] Section 15.2 : The Radix Sort

Preface :
Up to this point, we have discussed a variety of sorting algorithm. The elementary sorting algorithm selection sort and insertion sort are quadratic algorithms that use a brute-force method to order the elements. More advanced sorting algorithms mergesort and quicksort use a divide-and-conquer strategy that results in average runtime efficiency O(n log2 n). In this section, we will introduce a new design method for ordering an array of integers. The method, called the radix sort, is a linear algorithm that uses successive digits in the elements to partially sort the array. The technique is two-stage process refereed to astransform-and-conquer.
At each step, the transform stage passes the elements trough a series of queues that correspond to the values at a specified digit. The conquer stage build a progressively more ordered array by extracting the elements from the queues. The radix sort algorithm works by comparing individual pieces of an element rather than the element in its entirety. In out example, the pieces are digits in an integer. The radix sort algorithm can be used with other types of data in which comparison uses individual bits or sequences of bits.
The radix sort has been used since the early days of computing when data was stored on punched cards. To order the data, an operator ran the cards through a mechanical sorter. For integer data, the machine dropped each card into one of ten bins that represented the digits 0-9. Each bin was a queue in which a card entered at the back and exited at the front. The mechanical sorter implemented the radix sort algorithm. To explain the process, we assume that the cards contain two-digit numbers in the range 00-99. The numbers (cards) pass through the machine twice to separate the data first by the 1's digit and then by the 10's digit. Each pass involves first distributing the cards into the bins and then collecting them back into a sequence. For more detail, please refer to below figure which demonstrate the process :


Radix Sort Algorithm :
We can extend the sorting process from two-digit numbers to d-digit numbers, d>=1, by performing d passes. Recall that a base-10 number with d digits is a sum of products of its digits with powers 10. The first pass sorts by the 1's digit (power 10^1), the second by the 10's digit (power 10^1) and so forth. The static method radixSort() in the class Arrays implements the algorithm. It requires as parameters an integer array arr and the maximum number of digits dof any integer in the array :
public static void radixSort(int[] arr, int d) {...}

The private method distribute() implements the distribution of the numbers into the 10 queues. The parameters include the array of integers, the queue, and the power that designates which digit define the allocation of a number to a queue :
protected static void distribute(int[] arr, LinkedQueue digitQueues[], int power) {...}

After numbers are distributed into the queues, the private method collect() scans the array of queues in the order from 0 to 9 and moves all items from the queues back into array :
protected static void collect(LinkedQueue digitQueues[], int[] arr) {...}

Below is the implementation of radix sort algorithm from class Arrays : 

- Radix sort algorithm Implementation :
  1. public class Arrays {  
  2. ...  
  3.     public static void radixSort(int[] arr, int d) {  
  4.         int power = 1;  
  5.         LinkedQueue[] digitQueues= new LinkedQueue[10];  
  6.         for(int i=0; i<=9; i++) digitQueues[i] = new LinkedQueue();   
  7.         for(int i=0; i
  8.             distribute(arr, digitQueues, power);  
  9.             collect(digitQueues, arr);  
  10.             power*=10;  
  11.         }  
  12.     }  
  13.       
  14.     protected static void distribute(int[] arr, LinkedQueue digitQueues[], int power) {  
  15.         for(int i=0; i
  16.             digitQueues[(arr[i]/power%10)].push(arr[i]);              
  17.         }  
  18.     }  
  19.       
  20.     protected static void collect(LinkedQueue digitQueues[], int[] arr) {  
  21.         int index=0;  
  22.         for(int i=0; i<=9; i++) {  
  23.             LinkedQueue digitQueue = digitQueues[i];  
  24.             while(!digitQueue.isEmpty()) {  
  25.                 arr[index] = digitQueue.pop();  
  26.                 index++;  
  27.             }  
  28.         }             
  29.     }  
  30. ...  
  31. }  

Efficiency of the Radix Sort :
The radix sort orders an array of size n that contains d-digit integers. Each pass inserts the n numbers into the ten queues (bins) and then collects the numbers from the queues. The algorithm performs thees 2*n queue operations d times. Section 15.1 discusses the fact that queue push() and pop() operations have running time O(1). It allows that the complexity of the radix sort for d-digit numbers is O(2*d*n) = O(n). Radix sort appears to be superior to most in-place algorithms that reorder the data within original sequence and do not use temporary storage. The in-place sorts have average-case running time O(n^2) or O(n log2 n). The "superior" quality is deceiving because one measure of the efficiency of an algorithm is the amount of memory it requires. The radix sort requires extra space proportional to the size of the array that is being sorted. For very large data sets, the extra storage becomes a real liability.
Another issue is the number of bins (queues) that are required. With integers, 10 queues are sufficient, but the algorithm would require almost 100 separate queues for arbitrary strings. The number of bins does affect the efficiency of the radix sort.
The more general purpose O(n log2 n) algorithm, such as quick sort, adapt to a broader variety of applications than radix sort. They don't assume values in the array can be effectively partitioned into pieces. On a more technical note, the efficiency of the radix sort, as compared to that of an O(n log2 n) sort, can depend on features of the computer hardware. For instance, some systems are more efficient with the division operation and the copying of large blocks of data.

沒有留言:

張貼留言

網誌存檔

關於我自己

我的相片
Where there is a will, there is a way!