程式扎記: [ Data Structures with Java ] Section 16.5 : A Lower Bound for Sorting (Optional)

標籤

2011年3月14日 星期一

[ Data Structures with Java ] Section 16.5 : A Lower Bound for Sorting (Optional)


Preface :
In previous chapters, we develop a series of sorting algorithms that use comparisons to order the elements. The quadratic sorting algorithms such as insertion sort, bubble sort and exchange sort compare adjacent elements and all have worst- and average-case running time O(n^2). The more sophisticated sorting algorithms, such as quicksort and mergesort, also perform comparisons of array elements and have average-case running time of O(n log2n). To improve on quadratic running time, Section 7.2 discusses the fact that a sorting algorithm must compare nonadjacent elements. Are there as yet undiscovered sorting algorithms that use comparisons and have worst- and average-case running time better than O(n log2n)? Since sorting is so important, it would pay to search for such an algorithm.

Lower Bound for Sorting :
We answer the question by considering a decision tree for a sorting algorithm. Such a tree traces all the possible paths that occur when a sorting algorithm executes. Each interior node has the format "elt1:elt2", which means that we perform the test elt1 < elt2. The two edges from the node represent the results of the comparison, elt1 < elt2 or elt2 < elt1. The leaf nodes contains the final sorted sequence.
As an example, we can use the selection sort and construct the decision tree for sorting the 3-elements array {a,b,c}. For simplicity, assume that the array contains no duplicate. The first comparison determines whether a
Continuing the first pass, we compare a and c or b and c :

At this point, the first pass concludes and we exchange the smallest element with a. In the second pass, we must compare the elements at position 1 and 2 of the array. Below figure shows the final decision tree :

Assume we sort n elements. Since each leaf node represents a sorted sequence, there must be at least n! leaf nodes in the decision tree for the sorting algorithm. From the decision tree in upper figure, there are eight leaf nodes. Note that the first six (3!) enumerate the possible sorted orders. We can use the decision tree to obtain a lower bound on the running time for any sorting algorithm that uses comparisons. This is one of the most important result in computer science. To develop the result, we must first note a general fact for a binary tree :

Any sorting algorithm that uses comparisons has a decision tree, although it is not practical to construct it. As we have noted, the decision tree has at least n! leaf nodes, each one representing a possible ordering of the elements. If follows from our general fact that :

Where d is the depth of the decision tree. Since the log2(x) is an increasing function, we can take the logarithm of both sides of the inequality :

The worst case for the algorithm is to traverse a path to the deepest point in the decision tree; that is, to execute d comparisons. This tells us that the worst-case running time, T(n) satisfies the inequality :

It can be shown that for large n, n! is approximately equal to :

This is known as Stirling's approximation for n!. It can actually be shown that the following bound holds for all n :

If we omit the square root factor, we conclude that :

This result implies :

The term nlog2n-nlog2e is a lower bound for the running time of any sorting algorithm that uses comparisons. There is a commonly used notation for a result of this type. If the average- and worst-case performance of an algorithm is no better than g(n), we say that the algorithm has running time Ω (g(n)). Ignoring the insignificant term nlog2e, we have shown that any sorting algorithm that uses comparisons has worst-case running time Ω (nlog2n). 
This message was edited 7 times. Last update was at 25/02/2011 13:28:16

沒有留言:

張貼留言

網誌存檔