程式扎記: [ DS with Java ] Section 4.2 : Simple Search Algorithms

Thursday, December 9, 2010

Preface : 
Array search algorithms start with a target value and employ some indexing strategy to visit the elements looking for a match. If the target is found, the index of the matching element becomes the return value. Otherwise, the return value indicates that the target was not found. A search frequently covers the entire array, but a more general algorithm allows for searches in a sublist of the array. A sublist is a sequence of elements whose range of indices begins at index first and continues up to, but not including, index last. We denote a sublist by its index range [first, last). 
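
As a small illustration of the [first, last) convention, the sketch below counts and copies the elements of a sublist. The class name RangeSketch and both method names are hypothetical, chosen only for this example; they are not part of the Arrays class developed in this section.

```java
// Hypothetical helper class, for illustration only.
class RangeSketch {
    // Number of elements in the half-open index range [first, last).
    static int sublistLength(int first, int last) {
        return last - first;
    }

    // Copies the elements at indices first, first+1, ..., last-1.
    static int[] sublist(int[] arr, int first, int last) {
        int[] result = new int[sublistLength(first, last)];
        for (int i = first; i < last; i++) {
            result[i - first] = arr[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] arr = {6, 4, 2, 9, 5, 3, 10, 7};
        // The whole array is the range [0, arr.length).
        System.out.println(sublistLength(0, arr.length));
        // [2, 5) covers indices 2, 3 and 4; index 5 is excluded.
        System.out.println(java.util.Arrays.toString(sublist(arr, 2, 5)));
    }
}
```

One convenient property of this notation is that an empty sublist is simply a range with first equal to last.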

Sequential Search : 
The simplest search algorithm for an array is the sequential search. The algorithm begins with a target value and indices that define the sublist range. It scans the elements in the sublist item by item, looking for the first occurrence of a match with an item called target. If successful, the algorithm returns the index of the match. The search is not successful if the scan reaches the index last before matching the target. In this case, the algorithm returns -1. 

The seqSearch() Method : 
We develop the method seqSearch() for an integer array as a static method in the class Arrays. As we progress, we will develop additional searching algorithms and add them to this class. For arguments, the method has an array arr, the two indices first and last that specify the index range, and the target value. The return value is an index of type int. To illustrate the action of seqSearch(), consider the integer array : 
int[] arr = {6, 4, 2, 9, 5, 3, 10, 7}; 

The figure below illustrates a search of the entire array for target 3 and a search of the sublist [2,5) : 
 

The implementation of seqSearch() uses a for-loop to scan the array elements in the range [first, last). Iterations continue so long as the index is in range and no match is found. The scan terminates on a match and the index becomes the return value. Otherwise, -1 is the return value. 

- Arrays.java (partly) :
  public static int seqSearch(int[] arr, int first, int last, int target) {
      // scan indices in the range first <= i < last;
      // return the index of the first match;
      // otherwise return -1
      for (int i = first; i < last; i++) {
          if (arr[i] == target) return i;
      }
      // no match was found in [first, last)
      return -1;
  }

  public static void main(String[] args) {
      int[] list = {5, 3, 100, 89, 5, 6, 5};
      // search the whole array for 6
      System.out.println(Arrays.seqSearch(list, 0, list.length, 6));
      // search the index range [3, 5) for 100
      System.out.println(Arrays.seqSearch(list, 3, 5, 100));
      // restart the search just past each match to find every occurrence of 5
      int index = 0;
      System.out.print("5 occurs at indices ");
      while ((index = Arrays.seqSearch(list, index, list.length, 5)) != -1) {
          System.out.print(index + " ");
          index++;
      }
  }


Output :
5
-1
5 occurs at indices 0 4 6
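
To connect the method with the array introduced earlier, {6, 4, 2, 9, 5, 3, 10, 7}, here is a minimal sketch that performs the same scan and also counts how many elements it examines. The class SeqSearchSketch and its comparisons counter are assumptions added for illustration; only the scan itself follows the description in this section.

```java
// Illustrative sketch; SeqSearchSketch and its counter are hypothetical.
class SeqSearchSketch {
    // Number of elements examined by the most recent call to seqSearch.
    static int comparisons;

    // Sequential search of arr over the index range [first, last).
    static int seqSearch(int[] arr, int first, int last, int target) {
        comparisons = 0;
        for (int i = first; i < last; i++) {
            comparisons++;
            if (arr[i] == target) {
                return i;
            }
        }
        return -1; // the scan reached index last without a match
    }

    public static void main(String[] args) {
        int[] arr = {6, 4, 2, 9, 5, 3, 10, 7};

        // Whole array: 3 is at index 5, found after examining 6 elements.
        System.out.println(seqSearch(arr, 0, arr.length, 3) + " after " + comparisons);

        // Sublist [2, 5) holds 2, 9, 5, so 3 is not found there.
        System.out.println(seqSearch(arr, 2, 5, 3) + " after " + comparisons);
    }
}
```

When the target is absent, the scan examines every element of the range, so the cost grows linearly with the length of the sublist.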

Binary Search : 
The sequential search is a general search algorithm that applies to any array. It methodically scans the elements one after another until it finds a match or reaches the end of the range. You can employ a more efficient search strategy for an ordered array. You use this strategy when looking up a phone number in a telephone book. Suppose you want to find the number for "Swanson". A phone book maintains names in alphabetical order, so you look for "Swanson" somewhere near the back of the book. This approach is clearly superior to a sequential search, which would laboriously scan the "A"s, then the "B"s, and the subsequent letters of the alphabet. Our problem is to turn the rather haphazard phone number lookup strategy into an algorithm that will work for any ordered array. 
The solution is the binary search algorithm. Given a target value, the algorithm begins the search by selecting the midpoint of the list. If the value at the midpoint matches the target, the search is done. Otherwise, because the list is ordered, the search continues in the first half of the list (lower sublist) or in the second half of the list (upper sublist). If the target is less than the current midpoint value, look in the lower sublist by selecting its midpoint; otherwise, look in the upper sublist by selecting its midpoint. You continue the process by looking at the midpoints of smaller and smaller sublists. Eventually, the process either finds the target value or reduces the size of the sublist to 0, which signals that the target is not in the list. 
To get a more formal understanding of the binary search algorithm, we need to specify the meaning of midpoint and of lower and upper sublist in terms of array indices. Assume arr is an array with n items and that the search looks for the item called target. The indices for the full list are in the range [0,n). Start the search process by computing the middle index for the range [first, last), and then assign the value at that index to the variable midValue : 
mid = (first + last) /2; //middle index for [first, last) 
midValue = arr[mid]; //save value in midValue 
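
One practical caveat about the midpoint formula: for very large index values the sum first + last can exceed the int range and wrap around to a negative number. A common alternative is first + (last - first) / 2, which yields the same midpoint without forming the sum. The sketch below compares the two; the class and method names are hypothetical and serve only this illustration.

```java
// Illustrative sketch; MidpointSketch is a hypothetical name.
class MidpointSketch {
    // Midpoint as written in the text; the sum may overflow for huge indices.
    static int naiveMid(int first, int last) {
        return (first + last) / 2;
    }

    // Same midpoint for 0 <= first <= last, computed without forming first + last.
    static int safeMid(int first, int last) {
        return first + (last - first) / 2;
    }

    public static void main(String[] args) {
        // For ordinary array sizes the two formulas agree.
        System.out.println(naiveMid(0, 9) + " " + safeMid(0, 9));

        // With indices near Integer.MAX_VALUE the sum wraps around.
        int first = 1_500_000_000;
        int last = 2_000_000_000;
        System.out.println(naiveMid(first, last)); // negative because of overflow
        System.out.println(safeMid(first, last));
    }
}
```

For the small arrays used in this section the two formulas behave identically, so the simpler one is kept in the discussion that follows.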

Compare midValue with target. Three outcomes are possible, which trigger three separate actions : 

Case 1. A match occurs. The search is complete and mid is the index that locates target.
Case 2. The value target is less than midValue, and the search must continue in the lower sublist. The index range for this sublist is [first, mid). Reposition the index last to the end of the sublist (last = mid).
Case 3. The value target is greater than midValue and the search must continue in the upper sublist. The index range for this sublist is [mid+1, last), because the sublist begins immediately to the right of mid. Reposition the index first to the front of the sublist (first = mid+1).

The binary search terminates when a match is found or when the sublist to be searched is empty. An empty sublist occurs when first >= last.
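
The three cases and the termination condition above can be sketched as a loop. This is a minimal iterative version, assuming arr is sorted in ascending order; the class name BinSearchSketch is hypothetical, and the method mirrors the description rather than reproducing the Arrays class.

```java
// Illustrative sketch; BinSearchSketch is a hypothetical name.
class BinSearchSketch {
    // Iterative binary search of a sorted array over [first, last).
    // Returns the index of a match, or -1 if the target is not present.
    static int binSearch(int[] arr, int first, int last, int target) {
        while (first < last) {            // sublist is not empty
            int mid = (first + last) / 2; // middle index of [first, last)
            int midValue = arr[mid];
            if (target == midValue) {
                return mid;               // case 1: match
            } else if (target < midValue) {
                last = mid;               // case 2: continue in [first, mid)
            } else {
                first = mid + 1;          // case 3: continue in [mid+1, last)
            }
        }
        return -1;                        // first >= last: empty sublist
    }

    public static void main(String[] args) {
        int[] arr = {2, 3, 5, 8, 13, 21, 34};
        System.out.println(binSearch(arr, 0, arr.length, 13));
        System.out.println(binSearch(arr, 0, arr.length, 4));
    }
}
```

Each pass through the loop either returns or shrinks the range by at least one element, so the loop always terminates.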

The figure below gives snapshots of the binary search algorithm as it looks for a target in the nine-element integer array arr. 
 

The binSearch Method : 
The static method binSearch() in the Arrays class implements the binary search algorithm for an integer array. The method has four arguments: the array, the indices first and last that specify the index range, and the target. It returns the index of a matching element, or -1 if the target is not found. The implementation works on progressively smaller sublists [first, last); the listing below narrows the range with recursive calls rather than a loop. The search continues so long as the sublist is not empty (first < last) and no match occurs. After determining the middle index of the range and the corresponding array value, multiple selection statements compare the value with the target and handle the three possible outcomes : 

- Arrays.java (partly) :
  public static int binSearch(int[] arr, int first, int last, int target) {
      if (first < last) {  // the sublist [first, last) is not empty
          int mid = (first + last) / 2;
          int midValue = arr[mid];
          if (target == midValue) { return mid; }  // match
          else if (target > midValue) { return binSearch(arr, mid + 1, last, target); }  // upper sublist
          else { return binSearch(arr, first, mid, target); }  // lower sublist
      }
      return -1;  // empty sublist: target not found
  }

  public static void main(String[] args) {
      int[] arr = {11, 20, 33, 40, 45, 57, 60, 67, 79};
      // Search entire list for 60; first = 0, last = arr.length
      System.out.println(Arrays.binSearch(arr, 0, arr.length, 60)); // index = 6
      // Search the index range [3, 7) for 20
      System.out.println(Arrays.binSearch(arr, 3, 7, 20)); // index = -1
  }


Output :
6
-1
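
To see how quickly the range shrinks, the following sketch records the middle indices that a binary search probes for the same two searches. The class ProbeTraceSketch and the probes list are hypothetical additions for illustration; the array is assumed to hold the nine sorted values 11 through 79 used in the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; ProbeTraceSketch is a hypothetical name.
class ProbeTraceSketch {
    // Binary search over [first, last) that appends each probed index to probes.
    static int binSearch(int[] arr, int first, int last, int target, List<Integer> probes) {
        while (first < last) {
            int mid = (first + last) / 2;
            probes.add(mid);
            if (target == arr[mid]) {
                return mid;
            } else if (target < arr[mid]) {
                last = mid;
            } else {
                first = mid + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] arr = {11, 20, 33, 40, 45, 57, 60, 67, 79};

        List<Integer> probes = new ArrayList<>();
        int index = binSearch(arr, 0, arr.length, 60, probes);
        System.out.println(index + " via " + probes); // 6 via [4, 7, 6]

        probes.clear();
        index = binSearch(arr, 3, 7, 20, probes);
        System.out.println(index + " via " + probes); // -1 via [5, 4, 3]
    }
}
```

Both searches finish after three probes, whereas a sequential scan of the full nine-element array may need up to nine comparisons.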

Supplement : 
* [ Data Structures Primer ] Searching : Binary Search
